Here is a hands-on, deep-dive introduction to the information I’ve found about Microsoft’s Artificial Intelligence (AI) offerings running on the Azure cloud.
My contribution to the world (to you) is a less overwhelming learning sequence, one that starts with the least complex technologies, then moves to more complex ones. NOTE: Content here is my personal opinion, and is not intended to represent any employer (past or present). “PROTIP:” here highlights information I haven’t seen elsewhere on the internet because it is hard-won, little-known but significant, based on my personal research and experience.
Machine Learning vs. AI services vs. LLMs vs. Generative AI
AI is moving fast, so everything breaks. Like walking across a desert, we don’t get very far on our own.
Sign up for https://discord.com/invite/6GUBsZfMBq
Sign up for Microsoft for Startups Founders Hub to receive free OpenAI credits and up to $150k towards Azure credits to access OpenAI models through Azure OpenAI Services.
Review issues in the GitHub at
https://github.com/microsoft/generative-ai-for-beginners/issues
If you find something wrong in the content, add an issue in its GitHub.
Many give up entirely because they did not get their working environment properly set up before diving into the “For Beginners” courses.
This is covered in LEARN modules for certification AZ-104:
Using Command Line:
Install package manager Homebrew (if you’re on macOS).
Install Google’s Chrome browser.
On Microsoft's Edge browser:
Phone. If you can, generate a unique phone “burner” number.
Get onboarded to a Microsoft Azure subscription and learn Portal GUI menu keyboard shortcuts on the Azure portal at
https://portal.azure.com/?quickstart=True#view/Microsoft_Azure_Resources/QuickstartCenterBlade
https://portal.azure.com (Azure Cloud Shell)
Set up a CLI scripting environment in https://shell.azure.com like I describe in my mac-setup page
WARNING: A storage account is required for the CLI to work AND it costs money per month.
Open your Command Terminal. On macOS, use Terminal. On Windows, use cmd, Bash (Windows Terminal) or PowerShell.
Use CLI to Create a Cognitive Service to get keys to call the first REST API from among sample calls to many REST APIs: the Translator Text API.
On Windows 11, install Edge browser according to
https://microsoftlearning.github.io/mslearn-ai-services/Instructions/setup.html
The setup is for this LAB pop-up:
https://learn.microsoft.com/en-us/training/modules/create-manage-ai-services/5a-exercise-ai-services
Setup PowerShell scripts
View Hub in the Azure AI Foundry Management center
View Azure Cost analysis
Create a GitHub Personal Access Token to use GitHub Models Marketplace free access to LLMs used to create AI Agents.
Navigate to the folder for your GitHub account.
Add token & region to the .env file for this course.
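For the Translator Text API call mentioned in the setup steps above, here is a minimal sketch in Python using only the standard library. It assumes the global Translator endpoint and API version 3.0. The function names are made up for illustration, and the key and region are whatever your own resource shows under its Keys and Endpoint page. Building the request is kept separate from sending it, so nothing goes over the network until you call `send_request`.

```python
import json
import urllib.parse
import urllib.request

# Global Translator endpoint (assumption: your resource is not on a custom domain).
TRANSLATOR_URL = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(text, to_lang, key, region):
    """Build (without sending) a POST request for Translator Text API v3.0."""
    query = urllib.parse.urlencode({"api-version": "3.0", "to": to_lang})
    body = json.dumps([{"Text": text}]).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,        # key copied from your resource
        "Ocp-Apim-Subscription-Region": region,  # region the resource was created in
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{TRANSLATOR_URL}?{query}", data=body, headers=headers, method="POST"
    )


def send_request(request):
    """Send the request and return the first translated string in the response."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload[0]["translations"][0]["text"]
```

Usage would look like `send_request(build_translate_request("Hello", "fr", key, region))`, with `key` and `region` read from your .env file rather than typed into code.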
DO THIS: Click “Star” and “Watch” for “All Activity” in each repo below:
github.com/microsoft/generative-ai-for-beginners (aka.ms/genai-beginners) “Learn the fundamentals of building Generative AI applications with our 21-lesson comprehensive course by Microsoft Cloud Advocates. 21 Lessons teaching everything you need to know to start building Generative AI applications”. (Version 3) Maintained by @bethanyjep. VIDEO series
Generative AI Code Samples - a collection of code samples as extra learning and materials from the “Generative AI for Beginners” course.
github.com/Azure-Samples/rag-data-openai-python-promptflow “ChatGPT + Enterprise data with Azure OpenAI and AI Search - Python”
AI-in-a-Box - leverages the expertise of Microsoft across the globe to develop and provide AI and ML solutions to the technical community. Our intent is to present a curated collection of solution accelerators that can help engineers establish their AI/ML environments and solutions rapidly and with minimal friction.
github.com/microsoft/AI-For-Beginners (aka.ms/ai-beginners) says it’s a “12-week, 24-lesson curriculum! It includes practical lessons, quizzes, and labs. The curriculum is beginner-friendly and covers tools like TensorFlow and PyTorch, as well as ethics in AI.” It’s an update of the:
github.com/microsoft/ML-For-Beginners (aka.ms/ml-beginners) “12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all.” using primarily Scikit-learn.
github.com/microsoft/ai-agents-for-beginners (aka.ms/ai-agents-beginners) “10 Lessons to Get Started Building AI Agents”
End-user Productivity of Human-Computer Interaction (HCI)
Microsoft “democratizes” Machine Learning and AI by providing a front-end GUI that hides some of the complexities, enabling them to be run possibly without programming.
DEMO: Hands-on with AI/Guidelines for Human-AI Interaction: Click each card to see examples of each guideline
https://aka.ms/hci-demo which redirects you to
https://aidemos.microsoft.com/guidelines-for-human-ai-interaction/demo
PROTIP: Some fonts are really small. Zoom in to read them.
PROTIP: Although most of Microsoft’s product documents focus on one technology at a time, actual production work enjoyed by real end-users usually involves a pipeline consisting of several services. For example: ingesting (stream processing) a newsfeed for NLP (Natural Language Processing):
That and other flows are in the Azure Architecture Center.
For more about ALGORITHMS USED, see my explanations at https://bomonike.github.io/machine-learning-algorithms, which lists them by alphabetical order and grouped by.
Case studies of how people are already making use of AI/ML to save time and money:
Create a recommendation engine (such as what Netflix uses) from the Internet Movie Database (imdb.com)
Click picture for full-page view.
In the diagram above, Microsoft makes a distinction between “Business Users & Citizen Developers” who use their Applications and “Power Platform” and geeky “Developers & Data Scientists” who use “Azure AI” in the Azure cloud.
In the diagram above, Microsoft categorizes Azure’s AI services into these groups (all of which have GUI, CLI, and API interfaces):
PROTIP: Several services are NOT shown in the diagram above. The list in a Microsoft LEARN module shows a different order:
DEFINITION: There are what are called “narrow” or “weak” AI.
Click (Azure cloud) SERVICES USED:
Every product Microsoft offers is listed, by region in this table showing availability (regardless of whether you have permissions to each specific product).
The method shown in this section gives you a list of all “kinds” of services Azure currently provides for your Azure subscription.
After you have the prerequisites in place on your laptop (an Azure account with a subscription, plus the “az” CLI needed to run “az” commands in your Terminal), run this command:
az cognitiveservices account list-kinds
Response:
[ "AIServices", "AnomalyDetector", "CognitiveServices", "ComputerVision", "ContentModerator", "ContentSafety", "ConversationalLanguageUnderstanding", "CustomVision.Prediction", "CustomVision.Training", "Face", "FormRecognizer", "HealthInsights", "ImmersiveReader", "Internal.AllInOne", "LUIS.Authoring", "LanguageAuthoring", "MetricsAdvisor", "Personalizer", "QnAMaker.v2", "SpeechServices", "TextAnalytics", "TextTranslation" ]
PROTIP: “ContentModerator” has been deprecated.
PROTIP: The services above are listed in random order. The table below groups services to help you quickly get to links about features, tutorials, and SDK/API references.
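If you want to work with that list in a script rather than read it by eye, here is a small sketch. It assumes the “az” CLI is installed and logged in; the function names and the idea of filtering out deprecated kinds are mine, for illustration. Parsing is separate from the CLI call so the parsing can be tried on pasted output.

```python
import json
import subprocess


def parse_kinds(json_text, deprecated=("ContentModerator",)):
    """Return kinds from `list-kinds` JSON output, sorted case-insensitively,
    with deprecated kinds dropped."""
    kinds = json.loads(json_text)
    return sorted((k for k in kinds if k not in deprecated), key=str.lower)


def list_kinds():
    """Run the az CLI and return the cleaned-up list of kinds."""
    completed = subprocess.run(
        ["az", "cognitiveservices", "account", "list-kinds", "--output", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_kinds(completed.stdout)
```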
As of this writing, in various marketing and certification training DOCS, Azure Cognitive Services are grouped into the categories below, which is the basis on which this article is arranged. Click on the underlined and bolded category name to jump to the list of services associated with it, in this order (like on the AI Products Portfolio diagram):
AI Vision (Visual Perception) provides the ability to use computer vision capabilities to accept, interpret, and process input from images, video streams, and live cameras. Interpret the world visually through cameras, videos, images
AI Speech - Text-to-Speech and Speech-to-Text to interpret written or spoken language, and respond in kind. This provides the ability to recognize speech as input and synthesize spoken output. The combination of speech capabilities together with the ability to apply NLP analysis of text enables a form of human-computer interaction that’s become known as conversational AI, in which users can interact with AI agents (usually referred to as bots) in much the same way they would with another human.
AI Language - aka Natural language Processing (NLP) to translate text (Text Analysis), etc.
AI Decision (Making) provides the ability to use past experience and learned correlations to assess situations and take appropriate actions. For example, recognizing anomalies in sensor readings and taking automated action to prevent failure or system damage. This draws on supervised and unsupervised machine learning.
Other: OpenAI (to power your apps with large-scale AI models) is a recent add to this confusing category because of so many branding changes (Cortana, Bing, Cognitive, OpenAI, etc.)
Text analysis and conversion provides the ability to use natural language processing (NLP) to not only “read”, but also generate realistic responses and extract semantic meaning from text.
PROTIP: The table below groups each kind of cognitive AI service along with how many FREE transactions Microsoft provides for each service on its Cognitive Services pricing page.
IMPORTANT PROTIP: Microsoft allows its free “F0” tier to be applied to only a single Cognitive Service at a time. To remain free, you would need to rebuild a new Cognitive Service with a different “Kind” between steps.
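To make that rebuild less tedious, here is a hedged sketch that only generates the “az” commands for swapping the free-tier resource to a different “Kind”. The function names are hypothetical. Azure may keep a deleted resource in a soft-deleted state, so re-using the same name can require a purge first; that step is left out here.

```python
import shlex


def swap_free_tier_commands(name, resource_group, new_kind, location):
    """Return az commands that delete a resource, then recreate it
    with a different kind on the free F0 tier."""
    delete_cmd = [
        "az", "cognitiveservices", "account", "delete",
        "--name", name, "--resource-group", resource_group,
    ]
    create_cmd = [
        "az", "cognitiveservices", "account", "create",
        "--name", name, "--resource-group", resource_group,
        "--kind", new_kind, "--sku", "F0",
        "--location", location, "--yes",
    ]
    return [delete_cmd, create_cmd]


def as_shell(commands):
    """Render the commands as shell lines to review before running them."""
    return "\n".join(shlex.join(command) for command in commands)
```

Printing `as_shell(swap_free_tier_commands("cog-demo", "rg-demo", "TextTranslation", "eastus"))` gives two lines you can paste into a terminal once you are happy with them.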
Within each grouping, each service is listed in the sequence within that group’s LEARN module.
Links are provided for each service to its Features and API/SDK pages.
“MSR” in the table above identifies a Multi-Service Resource accessed using a single key and endpoint to consolidate billing.
PROTIP: HANDS-ON: Create the CognitiveAllInOne resource using the GUI so that you can acknowledge Microsoft’s terms for Responsible AI use:
PROTIP: Applying Terraform to create AI service resources will error out unless that box is checked.
Microsoft’s AI offerings have gone through quite a bit of churn.
Microsoft has invented several names to refer to their AI offerings:
Cortana => Bing => Cognitive Services => OpenAI => Generative AI => AI Language
“Cortana” was the brand-name for Microsoft’s AI. Cortana is the name of the fictional artificially intelligent character in the Halo video game series. Cortana was going to be Microsoft’s answer to Alexa, Siri, Hey Google, and other AI-powered personal assistants which respond to voice commands controlling skills that turn lights on and off, etc. However, since 2019, Cortana is considered a “skill” (app) that Amazon’s Alexa and Google Assistant can call, working across multiple platforms.
For Search, the “Bing” brand, before OpenAI, was separated out from “Cognitive Services” to its own site at https://docs.microsoft.com/en-us/azure/search, although it’s used in “Conversational AI” using an “agent” (Azure Bot Service) to participate in (natural) conversations. BTW: in 2019 Cortana was decoupled from Windows 10 search.
On October 31st, 2020, Bing Search APIs transitioned from the Azure Cognitive Services platform to Azure Marketplace. The Bing Search v7 API subscription covers several Bing Search services (Bing Image Search, Bing News Search, Bing Video Search, Bing Visual Search, and Bing Web Search).
Azure IoT (Edge) Services are separate.
The 2023 rebranding for Microsoft’s AI services to mimic human intelligence is “AI Language”, which includes Cognitive Services and Bing.
PROTIP: On Jan 8, 2024, https://aka.ms/language-studio had “coming soon” for Video and Learn, and “preview” for several services. Essentially Microsoft has two separate offerings by different groups:
Microsoft has three service “Providers”:
Asset type | Resource provider namespace/Entity | Abbreviation |
---|---|---|
Azure Cognitive Services | Microsoft.CognitiveServices/accounts | cog- |
Azure Machine Learning workspace | Microsoft.MachineLearningServices/workspaces | mlw- |
Azure Cognitive Search | Microsoft.Search/searchServices | srch- |
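As an illustration of how those abbreviations get used, here is a tiny naming helper. The prefixes come from the table above; the `<prefix><workload>-<environment>` pattern and the function name are my assumptions, not a Microsoft requirement.

```python
# Prefixes copied from the table above, keyed by resource provider namespace/entity.
PREFIXES = {
    "Microsoft.CognitiveServices/accounts": "cog-",
    "Microsoft.MachineLearningServices/workspaces": "mlw-",
    "Microsoft.Search/searchServices": "srch-",
}


def resource_name(resource_type, workload, environment):
    """Build a lowercase resource name such as 'srch-docs-dev' (hypothetical pattern)."""
    prefix = PREFIXES[resource_type]  # raises KeyError for types not in the table
    return f"{prefix}{workload}-{environment}".lower()
```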
Microsoft has been the “sole provider” of servers to OpenAI as part of some agreement that counts as investment.
In 2025, Microsoft was not part of the “$500 billion investment” announced in the Trump White House.
A few weeks later, Microsoft announced “Microsoft AI” (MAI).
In April 2018 Microsoft reorganized into two divisions to offer AI:
The research division, headed by Harry Shum, put AI into Bing search, Cortana voice recognition and text-to-speech, ambient computing, and robotics. See Harry’s presentation in 2016.
Microsoft’s “computing fabric” offerings, led by Scott Guthrie, make AI services available for those building customizable machine learning with speech, language, vision, and knowledge services. Tools offered include Cognitive Services and Bot Framework, deep-learning tools like Azure Machine Learning, Visual Studio Code Tools for AI, and Cognitive Toolkit.
At Build 2018, Microsoft announced Project Brainwave to run Google’s TensorFlow AI code and Facebook’s Caffe2, plus Microsoft’s own “Cognitive Toolkit” (CNTK).
BrainScript uses a dynamically typed C-like syntax to express neural networks in a way that looks like math formulas. BrainScript has a Performance Profiler.
Hyper-parameters are a separate module (alongside Network and reader) to perform SGD (stochastic-gradient descent).
Microsoft has advanced hardware:
Microsoft Conversational AI Platform for Developers is a 2021 book published by Apress by Stephan Bisser of Siili Solutions in Finland. The book covers Microsoft’s Bot Framework, LUIS, QnA Maker, and Azure Cognitive Services. https://github.com/orgs/BotBuilderCommunity/dashboard
Microsoft Mechanics YouTube channel is focused on Microsoft’s AI work.
VIDEO: What runs ChatGPT? Inside Microsoft’s AI supercomputer
NOTE: Wikipedia has 6,781,394 articles containing 4.5 billion words. English Wiktionary contains 1,439,188 definitions. Webster’s Third New International Dictionary has 470,000 English words.
https://azure.microsoft.com/en-us/services/virtual-machines/data-science-virtual-machines/
Microsoft Reactor ran a “Skills Challenge” to reward a badge for those who complete an AI tutorial.
https://learn.microsoft.com/en-us/training/challenges
https://www.youtube.com/watch?v=ss-kyogPRNo by Carlotta
Microsoft competes for talent with Google, Amazon, IBM, China’s Tencent, and many start-ups.
# | AI-900 Azure AI Fundamentals | AI-102 Azure AI Engineer Associate |
---|---|---|
- | ml.azure.com Azure Machine Learning Foundry Portal Exercises for MS LEARN FAW DOCS | ai.azure.com Azure AI Foundry Portal FAQ DOCS |
1. | AI Overview 3 hr 2 min | Exercises: Get started with Azure AI Services 5 hr 5 min |
2. | Computer Vision 1 hr 40 min | Exercises: Create computer vision solutions with Azure AI Vision 5 hr 1 min |
3. | Natural Language Processing 2 hr 39 min | Exercises: Develop natural language processing solutions with Azure AI Services 7 hr 4 min |
4. | Document Intelligence and Knowledge Mining 1 hr 19 min | Exercises: Develop solutions with Azure AI Document Intelligence 2 hr 3 min. Exercises: Implement knowledge mining with Azure AI Search 6 hr 24 min |
5. | Generative AI 3 hr 32 min | Exercises: Develop Generative AI solutions with Azure OpenAI Services 2 hr 13 min |
Among Microsoft’s Azure professional certifications illustrated by this pdf, there are three levels of AI:
AI-900 $99 Fundamentals is the entry-level exam. It’s a pre-requisite for:
AI-102 $165 Associate exam focuses on the use of pre-packaged cloud-based services for AI development. It has free re-cert after 1-year.
https://www.linkedin.com/in/alison-felix/ notice that in exercises some of the items are duplicated. For example:
Azure AI Foundry FAQ at:
Azure AI Foundry documentation is at:
Get 50% off the AI-102 if you finish Coursera’s Microsoft AI & ML Engineering Professional Certificate by Mark DiMauro at the Univ. of Pittsburgh. It consists of 5 courses:
There is also a Coursera course on “Developing AI Applications with Azure”.
Microsoft Developer Community Blog at https://techcommunity.microsoft.com/category/azure/blog/azuredevcommunityblog covers ALL the various tech Microsoft has (365, Bicep, etc. in Spanish, English, etc.)
https://devpost.com/software/avoid-maga-word-bans/joins/mG6Kzv8uN6hYCjW0jCt9lA
https://devpost.com/submit-to/23928-azure-ai-developer-hackathon/manage/submissions/630940-avoid-word-bans/project_details/edit DevPost Hackathon project described as “Avoid word bans”.
https://www.linkedin.com/pulse/program-find-words-banned-wilson-mar-msc-2ccbc/?trackingId=Ux0g5xgWdyTMf7EAC4hFow%3D%3D
Pamela Fox (MS Evangelist) (pamelafox.org) holds Office hours on Discord
In 2025 she held Microsoft Reactor livestream events in London, etc. on Intelligent Applications https://developer.microsoft.com/en-us/reactor/series/s-1491/
From aka.ms/PythonAI/series to https://github.com/pamelafox/python-openai-demos/discussions/21
18 March RAG Recording Slides on GumRoad https://github.com/microsoft/RAG_Hack?tab=readme-ov-file#stream-schedule
19 March, 2025 | 7:00 AM (UTC-06:00) Dynamics 365 CE - 2025 Release Topic: Intelligent Applications |
20 March, 2025 | 4:30 PM (UTC-06:00) Python + AI: Vision models |
AI Toolkit accesses FREE GitHub Models (these have limited tokens) at https://github.com/marketplace/models
https://gateway.on24.com/wcc/eh/4304051/category/142480/python-ai Python & AI resources
https://github.com/marketplace?type=models
https://github.com/Azure-Samples/raft-distillation-recipe A recipe that will walk you through using either Meta Llama 3.1 405B or GPT-4o deployed on Azure AI to generate a synthetic dataset using UC Berkeley’s Gorilla project RAFT method.
https://github.com/Azure-Samples/contoso-chat This sample has the full End2End process of creating RAG application with Prompty and Azure AI Foundry. It includes GPT 3.5 Turbo LLM application code, evaluations, deployment automation with AZD CLI, GitHub actions for evaluation and deployment and intent mapping for multiple LLM task mapping.
Previous exam 774 was retired. It was based on VIDEO: Azure Machine Learning Studio (classic) web services, which reflected “All Microsoft all the time” using proprietary “pickle” (pkl) model files. Classes referencing it are now obsolete.
The MS LEARN site refers to files in https://github.com/MicrosoftLearning/mslearn-ai900
AI-100 was retired on June 30, 2021 in favor of AI-102, with a shift from infrastructure (KeyVault, AKS, Stream Analytics) to programming in C#, Python, and curl commands. (Free re-cert after 2-years).
DP-090 and this LAB goes into implementing a Machine Learning Solution with Databricks, which has its own AI certification path.
DP-100 covers development of custom models using Azure Machine Learning.
DP-203 Data Engineering on Microsoft Azure goes into how to use machine learning within Azure Synapse Analytics. It was retired on March 31, 2025.
George Chen’s Journey to Microsoft Certified: AI Engineer Associate. Videos with PPT file!
PROTIP: http://aka.ms/AIFunPath which expands to https://github.com/MicrosoftLearning/mslearn-ai-fundamentals
Exam definitions are at Microsoft’s LEARN, which includes free text-based tutorials called “Learning Paths” to learn skills:
Describe features of conversational AI workloads on Azure (15-20%)
Tim Warner has created several video courses on AI-900 and AI-100:
CloudSkills.io Microsoft Azure AI Fundamentals course references
https://github.com/timothywarner/ai100cs
On OReilly.com, his “Crash Course” Excel spreadsheet of exam objectives.
OReilly.com references
https://github.com/timothywarner/ai100
CloudAcademy’s 4h AI-900 video course includes lab time (1-2 hours at a time).
https://www.udemy.com/course/microsoft-ai-900/
https://www.itexams.com/info/AI-900
Emilio Melo on Linkedin Learning
Practice tests:
https://www.whizlabs.com/learn/course/designing-and-implementing-an-azure-ai-solution/
AI-102 is intended for software developers wanting to build AI-infused applications.
https://learn.microsoft.com/en-us/credentials/certifications/azure-ai-engineer
First setup development environments:
AI-102 exam, as defined at Microsoft’s LEARN has free written tutorials on each of the exam’s domains:
PROTIP: Unlike the AI-100 (which uses Python Notebooks), hands-on exercises at https://microsoftlearning.github.io/AI-102-AIEngineer/ in Microsoft’s 5-day live course AI-102T00: Designing and Implementing a Microsoft Azure AI Solution (with cloud time) consist of C# and Python programs at https://github.com/MicrosoftLearning/AI-102-AIEngineer (by Graeme Malcolm). That repo was archived on Dec 23, 2023 after its content was distributed among these repos:
VIDEO Andrew Brown’s Azure AI Engineer Associate Certification (AI-102) – Full Course to PASS the Exam on FreeCodeCamp’s YouTube channel
https://learn.microsoft.com/en-us/training/courses/ai-102t00 modules:
On Coursera: the video course Developing AI Applications on Azure by Ronald J. Daskevich at LearnQuest is structured for 5 weeks. Coursera’s videos show the text at each point of the video. NOTE: It still sends people to https://notebooks.azure.com and covers Microsoft’s TDSP (Team Data Science Process) VIDEO:
Resources:
On June 30, 2021 Microsoft retired the AI-100 exam in favor of the AI-102 exam (available in $99 beta since Feb 2021). The AI-100 exam, as defined at Microsoft’s LEARN, has free written tutorials on each of the exam’s domains:
https://github.com/MicrosoftLearning/AI-100-Design-Implement-Azure-AISol
https://github.com/MicrosoftLearning/Principles-of-Machine-Learning-Python
Resources:
Guy Hummel’s CloudAcademy.com 7hr AI-100 video course.
Raza Salehi created on Pluralsight.com a series for Microsoft Azure AI Engineer (AI-100)
Practice tests:
AI Capabilities
https://learn.microsoft.com/en-us/training/modules/prepare-azure-ai-development/
In most medium to large-scale development scenarios it’s better to provision Azure AI services resources as part of an Azure AI Foundry hub. This enables you to centralize access control and cost management, and makes it easier to manage shared resource usage based on AI development projects. Hubs provide a top-level container for managing shared resources, data, connections and security configuration for AI application development. A hub can support multiple projects, in which developers collaborate on building a specific solution. You need at least one hub to use all of the solution development features and capabilities of AI Foundry.
https://learn.microsoft.com/en-us/training/modules/prepare-azure-ai-development/4-azure-ai-foundry
This document covers:
Automatically shut down Resource Groups of a Subscription by creating a Logic App.
Run an API connecting to an established endpoint (SaaS) you don’t need to setup: Bing Search.
Create Functions
Create a Workspace resource to run …
Use cognitivevision.com to Create Custom Vision for …
IMPORTANT PROTIP: As of this writing, Microsoft Azure does NOT have a full SaaS offering for every AI/ML service. You are required to create your own computer instances, and thus manage machine sizes (which is a hassle). Resources you create continue to cost money until you shut them down.
So after learning to set up the first compute service, we need to cover automation to shut them all down while you sleep.
So that you’re not tediously recreating everything everyday, this tutorial focuses on automation scripts (CLI Bash and PowerShell scripts) to create compute instances, publish results, then shut itself down. Each report run overwrites files from the previous run so you’re not constantly piling up storage costs.
Another reason for being able to rebuild is that if you find that the pricing tier chosen is no longer suitable for your solution, you must create a new Azure Cognitive Search resource and recreate all indexes and objects.
Use my automation scripts at https://github.com/wilsonmar/azure-quickly/ to create resources the way you like, using “Infrastructure as Code”, so you can throw away any Subscription and begin anew quickly.
My scripts also make use of a more secure way to store secrets than inserting them in code that can be checked back into GitHub.
Effective deletion hygiene is also a good way to see how your instances behave when they take advantage of cheaper spot instances, which can disappear at any time. This can also be used for “chaos engineering” efforts.
To verify resource status and to discuss with others, you still need skill at clicking through the Portal.azure.com, ML.azure.com, etc.
References:
VIDEO: To release IP address, don’t stop machines, but delete the resource.
VIDEO: shut down automatically all your existing VMs (using a PowerShell script called by a scheduled Logic App), by Frank Boucher at github.com/FBoucher
Auto-shutdown by Resource Manager on a schedule is only for VMs in DevOps
https://www.c-sharpcorner.com/article/deploy-a-google-action-on-azure/
start/stop by an Automation Account Runbook for specific tags attached to different Resource Groups: Assert: “AutoshutdownSchedule: Tuesday” run every hour. Google Translate
https://github.com/Azure/azure-sdk-for-python
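The tag-driven start/stop idea in the references above can be sketched as a filter over resource groups. This is only an illustration: it assumes the tag value is an English weekday name meaning “shut down on that day”, and that the input is shaped like the items from `az group list --output json` (each with a `name` and a `tags` object that may be null). The function name is hypothetical, and the actual shutdown call is left out.

```python
from datetime import date

# English weekday names indexed by date.weekday() (Monday is 0), to avoid locale surprises.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
TAG_NAME = "AutoshutdownSchedule"


def groups_due_for_shutdown(resource_groups, today):
    """Return names of resource groups whose schedule tag matches today's weekday."""
    day = DAY_NAMES[today.weekday()].lower()
    due = []
    for group in resource_groups:
        tags = group.get("tags") or {}  # tags is null when a group has none
        if tags.get(TAG_NAME, "").strip().lower() == day:
            due.append(group["name"])
    return due
```

A scheduled job (hourly, as in the reference) would call this with `date.today()` and then stop or delete whatever is in the returned list.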
Get 50% off by completing just one of these challenges:
View the marketing page at:
Prebuilt models expect a common type of form or document:
US Bank Statements
NOT:
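from azure.ai.documentintelligence.models import AnalyzeDocumentRequest, AnalyzeResult

# document_analysis_client and docUrl are assumed to be defined earlier in the lab code
poller = document_analysis_client.begin_analyze_document(
    "prebuilt-layout", AnalyzeDocumentRequest(url_source=docUrl)
)
result: AnalyzeResult = poller.result()  # waits for the analysis to finish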
LAB: Exercise - Analyze a document using Azure AI Document Intelligence
You are using the prebuilt layout model to analyze a document with many checkboxes. You want to find out whether each box is checked or empty. What object should you use in the returned JSON code? Selection marks record checkboxes and radio buttons and include whether they’re selected or not. NOT Bounding boxes. NOT Confidence indicators.
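To make the selection-marks answer concrete, here is a minimal sketch that walks the `analyzeResult` part of the returned JSON and reports each checkbox state. It assumes the REST response shape in which each page carries a `selectionMarks` array whose items have `state` (“selected” or “unselected”) and `confidence`. The function names are mine, and the sample used to try it out is invented.

```python
from collections import Counter


def selection_mark_states(analyze_result):
    """Return [(page_number, state, confidence)] for every selection mark found."""
    marks = []
    for page in analyze_result.get("pages", []):
        # Pages without checkboxes or radio buttons may omit the key entirely.
        for mark in page.get("selectionMarks", []):
            marks.append((page.get("pageNumber"), mark["state"], mark.get("confidence")))
    return marks


def count_states(analyze_result):
    """Count how many marks are selected versus unselected."""
    return Counter(state for _, state, _ in selection_mark_states(analyze_result))
```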
PROTIP: Although Word documents are also from Microsoft, DOCX format files are NOT supported by Azure AI Document Intelligence. But PDF documents are supported. Azure AI Document Intelligence is designed to analyze scanned and photographed paper documents, not documents that are already in a digital format so you should consider using another technology to extract the data in Word documents.
Get into the Azure Marketplace to look for Azure AI Document Intelligence – an Azure service that you can use to analyze forms completed by your customers, partners, employers, or others and extract the data that they contain.
References:
In production usage, several Azure subscriptions are needed.
The above image (which replaced the previous version) is the Azure Landing Zones OpenAI Reference Architecture defining how to wrap OpenAI with utilities to ensure a defensive security posture. It maps how resources are integrated in a structured, consistent manner, while ensuring governance, compliance, and security.
The diagram above is an adaptation of Microsoft’s enterprise-scale Azure Landing Zone, a part of Microsoft’s Cloud Adoption Framework (CAF).
Each Design Area has:
A. Enterprise enrollment [TF]
B. Identity and access management [TF]
C. Management group and subscription organization
D. Management subscription
E. Connectivity subscription
F. AI Services (Landing Zone) subscription
G. Monitoring?
H. Sandbox subscription
I. Platform DevOps Team [TF]
Design Area F (the Landing Zone for AI):
To create the resources in the diagram:
If you prefer using Bicep:
If you prefer using Terraform:
Included are Private Endpoints, Network Security Groups and Web Application Firewalls.
PROTIP: AI-102 is heavy on questions about coding.
Samples (unlike examples) are a more complete, best-practices solution for each of the snippets.
PROTIP: github.com/Azure-Samples from Microsoft offers sample code to use the Cognitive Services REST API in each language:
https://docs.microsoft.com/en-us/samples/azure-samples/azure-sdk-for-go-samples/azure-sdk-for-go-samples/
A complete sample app is Microsoft’s Northwind Traders consumer ecommerce store: install
IMPORTANT: Cognitive Services SDK Samples for:
Tim Warner’s https://github.com/timothywarner/ai100 includes PowerShell scripts:
Among Azure Machine Learning examples is a CLI at https://github.com/Azure/azureml-examples/tree/main/cli
PROTIP: CAUTION: Each service has a different maturity level in its documentation at azure.microsoft.com/en-us/downloads, such as SDK for Python open-sourced at github.com/azure/azure-sdk-for-python, described at docs.microsoft.com/en-us/azure/developer/python.
The hands-on steps below enable you to operate offline on a macOS laptop to run the Azure AI SDK in a Docker container using VS Code Dev Containers.
https://learn.microsoft.com/en-us/azure/ai-services/
runs from VSCode to run Python 3.10 with Docker containers within Azure AI Studio at https://ai.azure.com
Following How to start:
Work with Azure AI projects in VS Code
Install Docker Desktop, https://code.visualstudio.com/docs/devcontainers/tutorial
run the command
Dev Containers: Try a Dev Container Sample…
Install Git and configure it.
Create a folder “ai_env” from GitHub for the “Sample quickstart repo for getting started building an enterprise chat copilot in Azure AI Studio”:
FOLDER="ai_env"
git clone https://github.com/azure/aistudio-copilot-sample "${FOLDER}" --depth 1
cd "${FOLDER}"
Response:
Cloning into 'ai_env'...
remote: Enumerating objects: 122, done.
remote: Counting objects: 100% (122/122), done.
remote: Compressing objects: 100% (107/107), done.
remote: Total 122 (delta 11), reused 88 (delta 8), pack-reused 0
Receiving objects: 100% (122/122), 227.58 KiB | 951.00 KiB/s, done.
Resolving deltas: 100% (11/11), done.
Define provenance:
git remote add upstream https://github.com/azure/aistudio-copilot-sample
git remote -v
Open with VSCode to an error:
code ai_env
Select the “Reopen in Dev Containers” button. If it doesn’t appear, open the command palette (Ctrl+Shift+P on Windows and Linux, Cmd+Shift+P on Mac) and run the Dev Containers: Reopen in Container command.
Install the Azure CLI using Homebrew (requires python@3.11?):
brew install azure-cli
Install the .NET (dotnet) CLI commands:
Add $HOME/.dotnet/tools folder to .zshrc file run at bootup.
From any folder, create a Conda environment containing the Azure AI SDK:
conda create --name ai_env python=3.10 pip
conda activate ai_env
Sample response:
Collecting package metadata (current_repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 23.5.0
  latest version: 24.1.1

Please update conda by running
    $ conda update -n base -c conda-forge conda
Or to minimize the number of packages updated during conda update use
    conda install conda=24.1.1

## Package Plan ##
  environment location: /Users/wilsonmar/miniconda3/envs/ai_env
  added / updated specs:
    - python=3.10

The following packages will be downloaded:
    package | build
    ---------------------------|-----------------
    python-3.10.13 |h00d2728_1_cpython 12.4 MB conda-forge
    setuptools-69.1.0 | pyhd8ed1ab_0 460 KB conda-forge
    ------------------------------------------------------------
    Total: 12.9 MB

The following NEW packages will be INSTALLED:
  bzip2 conda-forge/osx-64::bzip2-1.0.8-h10d778d_5
  ca-certificates conda-forge/osx-64::ca-certificates-2024.2.2-h8857fd0_0
  libffi conda-forge/osx-64::libffi-3.4.2-h0d85af4_5
  libsqlite conda-forge/osx-64::libsqlite-3.45.1-h92b6c6a_0
  libzlib conda-forge/osx-64::libzlib-1.2.13-h8a1eda9_5
  ncurses conda-forge/osx-64::ncurses-6.4-h93d8f39_2
  openssl conda-forge/osx-64::openssl-3.2.1-hd75f5a5_0
  pip conda-forge/noarch::pip-24.0-pyhd8ed1ab_0
  python conda-forge/osx-64::python-3.10.13-h00d2728_1_cpython
  readline conda-forge/osx-64::readline-8.2-h9e318b2_1
  setuptools conda-forge/noarch::setuptools-69.1.0-pyhd8ed1ab_0
  tk conda-forge/osx-64::tk-8.6.13-h1abcd95_1
  tzdata conda-forge/noarch::tzdata-2024a-h0c530f3_0
  wheel conda-forge/noarch::wheel-0.42.0-pyhd8ed1ab_0
  xz conda-forge/osx-64::xz-5.2.6-h775f41a_0

Proceed ([y]/n)? _
Install the Azure AI CLI using the .NET (dotnet) CLI command:
dotnet tool install --global Azure.AI.CLI --prerelease
Response:
You can invoke the tool using the following command: ai Tool 'azure.ai.cli' (version '1.0.0-preview-20240216.1') was successfully installed.
Obtain Python dependency libraries defined in requirements.txt:
conda install pip
pip install -r requirements.txt
Alternately, if the repo instead provides an environment.yml file, create the Conda environment from that file.
Run the ai CLI without arguments to display its help:
ai
Results with the prerelease version:
AI - Azure AI CLI, Version 1.0.0-preview-20240216.1
Copyright (c) 2024 Microsoft Corporation. All Rights Reserved.

This PUBLIC PREVIEW version may change at any time.
See: https://aka.ms/azure-ai-cli-public-preview

___ ____ ___ _____ / _ /_ / / _ |/_ _/ / __ |/ /_/ __ |_/ /_ /_/ |_/___/_/ |_/____/

USAGE: ai[...]

HELP
  ai help
  ai help init

COMMANDS
  ai init [...]     (see: ai help init)
  ai config [...]   (see: ai help config)
  ai dev [...]      (see: ai help dev)
  ai chat [...]     (see: ai help chat)
  ai flow [...]     (see: ai help flow)
  ai search [...]   (see: ai help search)
  ai speech [...]   (see: ai help speech)
  ai service [...]  (see: ai help service)

EXAMPLES
  ai init
  ai chat --interactive --system @prompt.txt
  ai search index update --name MyIndex --files *.md
  ai chat --interactive --system @prompt.txt --index-name MyIndex

SEE ALSO
  ai help examples
  ai help find "prompt"
  ai help find "prompt" --expand
  ai help find topics "examples"
  ai help list topics
  ai help documentation
Navigate to the folder
Initialize the ai project:
ai init
ai login
(interactive device code)
Open an internet browser (Safari) to the page
Close the browser tab when you see
You have signed in to the Microsoft Azure Cross-platform Command Line Interface application on your device. You may now close this window.
cd to the folder
Notice that file src/copilot_aisdk/requirements.txt contains:
openai
azure-identity
azure-search-documents==11.4.0b6
jinja2
For “AZURE OPENAI DEPLOYMENT (CHAT)” select “(Create new)”. FIXME:
CREATE DEPLOYMENT (CHAT)
Model: *** No deployable models found ***
CANCELED: No deployment selected
Azure AI search resource
This generates a config.json file in the root of the repo for the SDK to use when authenticating to Azure AI services.
Alternately:
https://github.com/azure-samples/azureai-samples
Select the Node sample from the list.
File devcontainer.json is a config file that determines how your dev container gets built and started.
https://microsoftlearning.github.io/mslearn-ai-vision/ EXERCISES based on
https://github.com/MicrosoftLearning/mslearn-ai-vision
LLMOps: LLMOps Microsoft Developer videos
aka.ms/cognitivevision resolves to YouTube channel “Microsoft Mechanics” rebranded to include “agents”, such as:
In a web browser, navigate to Vision Studio:
Select a Subscription.
VIDEO: The menu of AI Vision service categories uses Microsoft’s “Florence foundational model” (new in 2023) models trained with an “open world” of billions of images combined with a large language model. That enables the identification of objects and their location in a frame such as this:
Click each, then “Try it out”:
Optical character recognition (OCR):
Select a Region (East US, West Europe, West US, or West US 2)
The “cog-ms-learn-vision” resource is created for you automatically.
https://github.com/Azure-Samples/cognitive-service-vision-model-customization-python-samples
https://microsoftlearning.github.io/mslearn-ai-fundamentals/Instructions/Labs/05-ocr.html
On the Getting started with Vision landing page, select Optical character recognition, and then the Extract text from images tile.
Under the Try It Out subheading, acknowledge the resource usage policy by reading and checking the box.
download ocr-images.zip by selecting
https://aka.ms/mslearn-ocr-images
Open the folder.
On the portal, select Browse for a file and navigate to the folder on your computer where you downloaded ocr-images.zip. Select advert.jpg and select Open.
review what is returned:
In Detected attributes, any text found in the image is organized into a hierarchical structure of regions, lines, and words.
On the image, the location of each piece of text is indicated by a bounding box.
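The regions, lines, and words hierarchy described above can be walked with a few nested loops. This is a minimal sketch, assuming a response shaped like the legacy OCR JSON (a "regions" list containing "lines", each containing "words" with a "boundingBox" string of "left,top,width,height"). The sample data and the function names are hypothetical, not taken from the lab.

```python
def iter_words(ocr_result):
    """Yield (text, (left, top, width, height)) for every word,
    walking the hierarchy regions -> lines -> words."""
    for region in ocr_result.get("regions", []):
        for line in region.get("lines", []):
            for word in line.get("words", []):
                box = tuple(int(n) for n in word["boundingBox"].split(","))
                yield word["text"], box


def lines_of_text(ocr_result):
    """Join the words of each line back into a readable string."""
    return [
        " ".join(word["text"] for word in line.get("words", []))
        for region in ocr_result.get("regions", [])
        for line in region.get("lines", [])
    ]


# Hypothetical sample, shaped like the assumed response format.
sample = {
    "regions": [
        {
            "boundingBox": "20,30,300,80",
            "lines": [
                {
                    "boundingBox": "20,30,300,40",
                    "words": [
                        {"boundingBox": "20,30,120,40", "text": "Fresh"},
                        {"boundingBox": "150,30,170,40", "text": "produce"},
                    ],
                }
            ],
        }
    ]
}

for text, box in iter_words(sample):
    print(text, box)
```

Each word carries its own bounding box, so the same loop can drive both text extraction and drawing rectangles on the image.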
App for the blind: VIDEO: INTRO: SeeingAI.com. Permissions for the “See It All” app are for its internal name “Mt Studio Web Prod”.
https://docs.microsoft.com/en-us/learn/paths/explore-computer-vision-microsoft-azure
HISTORY: In 2014, Microsoft showed off its facial recognition capabilities with a website (how-old.net which now is owned by others) to guess how old someone is. At conferences they built a booth that takes a picture.
LEARN: https://docs.microsoft.com/en-us/learn/modules/read-text-computer-vision/
DEMO: Seeing AI app talking camera narrates the world around blind people.
Image analysis
“Computer Vision” analyzes images and video to extract descriptions, tags, objects, and text. API Reference, DOCS, INTRO:
https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/quickstarts-sdk/client-library?tabs=visual-studio&pivots=programming-language-csharp READ
Select images and review the information returned by the Azure Computer Vision web service:
DEMO: https://aidemos.microsoft.com/computer-vision
Click an image to see results of “Analyze and describe images”. Objects are returned with a bounding box to indicate their location within the image.
Additionally, the Computer Vision service can:
Computer Vision API shows all the features.
X: On another browser tab, view the repo (faster):
Follow the instructions in the notebook to create a resource, etc.
TODO: Incorporate the code and put it in a pipeline that minimizes manual actions.
Azure Custom Vision trains custom models referencing custom (your own) images. Custom vision has two project types:
Image classification is a machine-learning-based form of computer vision in which a model is trained to categorize images by the class of the primary subject matter they contain.
Object detection goes further than classification to identify the “class” of individual objects within the image, and to return the coordinates of a bounding box that indicates the object’s location.
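To make the bounding-box idea concrete, here is a minimal sketch that converts a detection's box into pixel coordinates. It assumes the box is given as normalized values (0 to 1) with "left", "top", "width", and "height" keys, which is how Custom Vision predictions are commonly returned. The sample prediction and the function name are hypothetical.

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a normalized bounding box (values between 0 and 1)
    into integer pixel corners: (left, top, right, bottom)."""
    left = round(box["left"] * image_width)
    top = round(box["top"] * image_height)
    right = round((box["left"] + box["width"]) * image_width)
    bottom = round((box["top"] + box["height"]) * image_height)
    return left, top, right, bottom


# Hypothetical prediction for a 640 x 480 image.
prediction = {
    "tagName": "apple",
    "probability": 0.97,
    "boundingBox": {"left": 0.25, "top": 0.5, "width": 0.5, "height": 0.25},
}

print(prediction["tagName"], to_pixel_box(prediction["boundingBox"], 640, 480))
```

Classification would return only the tag and probability. Detection adds the box, which is what lets an app draw a rectangle around each object.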
HANDS-ON: LEARN hands-on lab
LAB: Steps:
Perform object detection to locate elements within an image and return a bounding box.
Open
DOCS:
MS LEARN HANDS-ON LAB:
aka.ms/learn-image-classification which redirects to
docs.microsoft.com/en-us/learn/modules/classify-images-custom-vision
Load the code from:
https://github.com/MicrosoftLearning/mslearn-ai900/blob/main/03%20-%20Object%20Detection.ipynb
https://docs.microsoft.com/en-us/learn/modules/evaluate-requirements-for-custom-computer-vision-api/3-investigate-service-authorization Custom Vision APIs use two subscription keys, each controlling access to one API:
https://docs.microsoft.com/en-us/learn/modules/evaluate-requirements-for-custom-computer-vision-api/4-examine-the-custom-vision-prediction-api
References: CV API
Azure “Face” is used to build face detection and facial recognition solutions in five categories:
NOTE: On June 11, 2020, Microsoft announced that it will not sell facial recognition technology to police departments in the United States until strong regulation, grounded in human rights, has been enacted. As such, customers may not use facial recognition features or functionality included in Azure Services, such as Face or Video Indexer, if a customer is, or is allowing use of such services by or for, a police department in the United States.
HANDS-ON: LEARN tutorial using these Lab files.
A face location is face coordinates – a rectangular pixel area in the image where a face has been identified.
The Face API can return up to 27 landmarks for each identified face that you can use for analysis. Azure allows a person to have up to 248 faces. There is a 6 MB limit on the size of each file (jpeg, png, gif, bmp).
Face attributes are predefined properties of a face or a person represented by a face. The Face API can optionally identify and return the following types of attributes for a detected face:
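Each emotion detected is returned in the JSON response as a floating point number:
PROTIP: Scores range from 0 to 1, so a “Happiness” score just under 1.0 is near certainty. A score such as “2.80234E-08” is in scientific notation: the decimal point moves 8 places to the left, so the value is effectively zero.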
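Scientific notation in these scores is easy to misread, so a small sketch may help. It assumes the emotion scores arrive as a dictionary of name to float. The sample values below are hypothetical (only 2.80234E-08 comes from the text above), and the function name is illustrative.

```python
def dominant_emotion(emotions):
    """Return (name, score) for the highest-scoring emotion."""
    name = max(emotions, key=emotions.get)
    return name, emotions[name]


# Hypothetical scores. Python parses "E-08" notation directly.
scores = {
    "anger": float("2.80234E-08"),
    "happiness": 0.999983543,
    "sadness": 1.2e-05,
}

# Print each score in plain decimal form to show how small E-08 really is.
for name, score in scores.items():
    print(f"{name:10s} {score:.10f}")

print(dominant_emotion(scores))
```

Printed with ten decimal places, the anger score shows as 0.0000000280, which makes it obvious that it is noise next to the happiness score.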
https://github.com/Azure-Samples/cognitive-services-FaceAPIEnrollmentSample
DEMO: LAB: https://github.com/microsoft/hackwithazure/tree/master/workshops/web-ai-happy-sad-angry
https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/quickstarts-sdk/identity-client-library?tabs=windows%2Cvisual-studio&pivots=programming-language-csharp
Subscribe to the Face API:
Enter a unique name for your Face API subscription name in variable MY_FACE_ACCT
Paste in setmem.sh
export MY_FACE_ACCT=faceme
Click “Create” to subscribe to the Face API.
In “Keys and Endpoint”, copy Key1 and paste in setmem.sh
export MY_FACE_KEY1=subscription_key
The endpoint used to make REST calls is “$MY_FACE_ACCT.cognitiveservices.azure.com/”
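As a rough illustration of how the account name and key fit together, the sketch below builds (but does not send) a face detection request using only the Python standard library. It assumes the Face REST path /face/v1.0/detect and the Ocp-Apim-Subscription-Key header. The image URL and function name are hypothetical, and the account and key would normally come from the MY_FACE_ACCT and MY_FACE_KEY1 variables.

```python
import json
import urllib.parse
import urllib.request


def build_detect_request(account, key, image_url):
    """Build a POST request for face detection without sending it."""
    endpoint = f"https://{account}.cognitiveservices.azure.com"
    query = urllib.parse.urlencode({"returnFaceLandmarks": "true"})
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # Key1 from "Keys and Endpoint"
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{endpoint}/face/v1.0/detect?{query}",
        data=body,
        headers=headers,
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_detect_request("faceme", "subscription_key", "https://example.com/face.jpg")
print(request.get_method(), request.full_url)
# To actually call the service: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers before spending any quota.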
https://github.com/wilsonmar/azure-quickly/blob/main/az-face-init.sh
https://docs.microsoft.com/en-us/learn/modules/identify-faces-with-computer-vision/8-test-face-detection?pivots=csharp
References:
https://docs.microsoft.com/en-us/azure/cognitive-services/Face/Overview What is the Azure Face service?
https://docs.microsoft.com/en-us/azure/cognitive-services/Face/
https://docs.microsoft.com/en-us/azure/cognitive-services/Face/quickstarts/client-libraries?tabs=visual-studio&pivots=programming-language-csharp
https://github.com/MicrosoftLearning/mslearn-ai900/blob/main/04%20-%20Face%20Analysis.ipynb
Media Services & Storage Account:
In Portal, Media Services blade.
Specify Resource Group, account name, storage account, System-managed identity.
In a browser, go to the Video Indexer Portal URL:
NOTE: Video Indexer is under Media Services rather than Cognitive Services.
Click the provider to login: AAD (Entra) account, Personal Microsoft account, Google.
PROTIP: Avoid using Google due to the permissions you’re asked to give:
Say Yes to Video Indexer permission to: Access your email addresses & View your profile info and contact list, including your name, gender, display picture, contacts, and friends.
NOTE: You’ll get an email with subject “Your subscription to the Video Indexer API”.
On your mobile phone you’ll get a “Connected to new app” notice for Microsoft Authenticator.
Click “Account settings”. PRICING: up to 10 hours (600 minutes) of free indexing to website users and up to 40 hours (2,400 minutes) of free indexing to API users. Media reserved units are pre-paid. See FAQ
Switch to the file which defines Azure environment variable VIDEO_INDEXER_ACCOUNT (in setmem.sh) as described in
Switch back.
Go to the “Azure Video Analyzer for Media Developer Portal”:DOCS
Click “Sign In”. Click “Profile”
NOTE: The UI has changed since publication of Microsoft’s tutorial, which says “Go to the Products tab, then select Authorization.”
Switch to the file which defines Azure environment variable VIDEO_INDEXER_API_KEY (in setmem.sh) as described in
https://bomonike.github.io/azure-quickly
Switch back.
https://api-portal.videoindexer.ai/api-details#api=Operations&operation=Get-Account-Access-Token
DOCS: Select the Azure Video Indexer option for uploading videos: upload from URL. (There is also the option to send the file as a byte array in an API call, which is limited to 2 GB in size and a 30-minute timeout.)
az-video-upload.py in https://bomonike.github.io/azure-quickly
Make an additional call to retrieve insights.
Reference existing asset ID
In “Media files” at https://www.videoindexer.ai/media/library
Click “Samples”, and click on a video file to Play to see the media’s people, topics (keywords).
NOTE: Search results include exact start times where an insight exists, possibly multiple matches for the same video if multiple segments are matched.
Click a tag to see where it was mentioned in the timeline.
Alternately, use the API to search: ???
Each video consists of scenes grouping shots, which each contain keyframes.
A scene represents a single event within the video. It groups consecutive shots that are related. It will have a start time, end time, and thumbnail (first keyframe in the scene).
A shot represents a continuous segment of the video. Transitions within the video are detected which determine how it is split into shots. Shots have a start time, end time, and list of keyframes.
Keyframes are frames that represent the shot. Each one is for a specific point in time. There can be gaps in time between keyframes but together they are representative of the shot. Each keyframe can be downloaded as a high-resolution image.
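The scene, shot, and keyframe hierarchy described above can be modeled with a few small classes. This is only a sketch of the relationships in the text. The class and field names are hypothetical and do not mirror the actual Video Indexer JSON.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Keyframe:
    time_seconds: float  # the specific point in time this frame represents


@dataclass
class Shot:
    start: float
    end: float
    keyframes: List[Keyframe] = field(default_factory=list)


@dataclass
class Scene:
    shots: List[Shot]  # consecutive, related shots

    @property
    def start(self):
        return self.shots[0].start

    @property
    def end(self):
        return self.shots[-1].end

    @property
    def thumbnail(self):
        """First keyframe in the scene, or None if there are no keyframes."""
        for shot in self.shots:
            if shot.keyframes:
                return shot.keyframes[0]
        return None


# Hypothetical video fragment: one scene made of two shots.
scene = Scene(shots=[
    Shot(0.0, 4.5, [Keyframe(1.0), Keyframe(3.2)]),
    Shot(4.5, 9.0, [Keyframe(6.0)]),
])
print(scene.start, scene.end, scene.thumbnail)
```

Deriving the scene's start, end, and thumbnail from its shots keeps the three levels consistent with each other.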
Click “Model customizations”
https://github.com/Azure-Samples/media-services-video-indexer
https://dev.to/adbertram/getting-started-with-azure-video-indexer-and-powershell-3i32
“Form Recognizer” extracts information from images obtained from scanned forms and invoices.
Resources:
https://github.com/MicrosoftLearning/mslearn-ai900/blob/main/06%20-%20Receipts%20with%20Form%20Recognizer.ipynb
https://docs.microsoft.com/en-us/samples/azure/azure-sdk-for-python/tables-samples/
https://docs.microsoft.com/en-us/samples/azure/azure-sdk-for-java/formrecognizer-java-samples/
https://docs.microsoft.com/en-us/samples/azure/azure-sdk-for-net/azure-form-recognizer-client-sdk-samples/
https://docs.microsoft.com/en-us/samples/azure/azure-sdk-for-python/formrecognizer-samples/
https://github.com/MicrosoftLearning/mslearn-ai900/blob/main/05%20-%20Optical%20Character%20Recognition.ipynb
Image classification - https://github.com/MicrosoftLearning/mslearn-ai900/blob/main/01%20-%20Image%20Analysis%20with%20Computer%20Vision.ipynb
Object detection - https://github.com/MicrosoftLearning/mslearn-ai900/blob/main/02%20-%20Image%20Classification.ipynb
Ink converts handwriting to plain text, in 63+ core languages.
It was deprecated on 31 January 2021.
https://docs.microsoft.com/en-us/azure/cognitive-services/ink-recognizer/quickstarts/csharp Quickstart: Recognize digital ink with the Ink Recognizer REST API and C#
QUESTION: Does it integrate with a tablet?
Azure Machine Learning 2.0 CLI (preview) examples
https://github.com/Azure-Samples/Cognitive-Services-Vision-Solution-Templates
BTW https://docs.microsoft.com/en-us/samples/azure-samples/cognitive-services-quickstart-code/cognitive-services-quickstart-code/ https://github.com/Azure-Samples/cognitive-services-sample-data-files
My script does the same as these manual steps:
Click the blue icon to the right of KEY 1 heading to copy it to your invisible Clipboard.
Endpoint: https://tot.cognitiveservices.azure.com/
TODO: DOCS: Automate the above steps to create compute and a server startup script.
PROTIP: These instructions are not in Microsoft LEARN’s tutorial.
“Your document is currently not connected to a compute. Switch to a running compute or create a new compute to run a cell.”
You would save money if you don’t leave servers running, racking up charges.
You can confidently delete Resource Groups and all resources attached if you have automation in CLI scripts that enable you to easily create them later.
Instead of the manual steps defined in this LAB, run my Bash script in CLI, as defined by this DOC:
G+\ Cognitive Services.
TODO: Instead of putting plain text of cog_key in code, reference Azure Vault. Have the code in GitHub.
Azure has a cognitiveservices CLI subcommand.
https://docs.audd.io/?ref=public-apis
Formerly called “NLP” (Natural Language Processing), Intro: Tutorial: https://docs.microsoft.com/en-us/learn/paths/explore-natural-language-processing
NLP enables the creation of software that can:
Within Microsoft, NLP consists of these Azure services (described below):
https://microsoftlearning.github.io/mslearn-ai-language/
Speaker Recognition for authentication.
In contrast, Speaker Diarization groups segments of audio by speaker in a batch operation.
“In cloudinary”
Think of “LUIS” as Amazon Alexa’s frenemy.
https://www.luis.ai, provides examples of how to use LUIS (Language Understanding Intelligent Service) thus:
A machine learning-based service to build natural language into apps, bots, and IoT devices. Quickly create enterprise-ready, custom models that continuously improve.
Bot Framework Emulator Follow the instructions at https://github.com/Microsoft/BotFramework-Emulator/blob/master/README.md to download and install the latest stable version of the Bot Framework Emulator for your operating system.
Bot Framework Composer Install from https://docs.microsoft.com/en-us/composer/install-composer.
Utterances are input from the user that your app needs to interpret.
https://github.com/Azure-Samples/cognitive-services-language-understanding
CHALLENGE: Add natural language capabilities to a picture-management bot.
https://www.slideshare.net/goelles/sharepoint-saturday-belgium-2019-unite-your-modern-workplace-with-microsofsts-ai-ecosystem
Perhaps for less latency, create the LUIS app in the same geographic location where you created the service:
NOTE: LUIS is not a service like “Cognitive Services”, but a Marketplace item:
Enter a unique name for your LUIS service. This becomes a public sub-domain of one of the LUIS regional websites (above).
PROTIP: Different localities have different costs.
If you are making use of my framework, open the environment variables file:
code ../setmem.sh
Locate the variables and replace the sample values:
export MY_LUIS_AUTHORING_KEY="abcdef1234567896b87f122281e9187e"
export MY_LUIS_ENDPOINT=""
SECURITY PROTIP: Put key values in a global variable definitions file that is kept separate from the code.
authoring_key = env:MY_LUIS_AUTHORING_KEY
authoring_endpoint = env:MY_LUIS_ENDPOINT
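In Python, the two env: references above could be read as follows. This is a minimal sketch that assumes the variables were exported by sourcing setmem.sh before the script runs. The function name is hypothetical.

```python
import os


def load_luis_settings(env=None):
    """Read the LUIS authoring key and endpoint from environment variables,
    failing early with a clear message if either one is missing or empty."""
    env = os.environ if env is None else env
    names = ("MY_LUIS_AUTHORING_KEY", "MY_LUIS_ENDPOINT")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["MY_LUIS_AUTHORING_KEY"], env["MY_LUIS_ENDPOINT"]


# Example with an explicit dictionary (hypothetical values), so the sketch
# runs even when the real variables are not set.
authoring_key, authoring_endpoint = load_luis_settings({
    "MY_LUIS_AUTHORING_KEY": "abcdef1234567896b87f122281e9187e",
    "MY_LUIS_ENDPOINT": "https://luis-resource-name.cognitiveservices.azure.com/",
})
print(authoring_endpoint)
```

Calling load_luis_settings() with no argument reads the real environment, so the key never has to appear in the source file.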
Create a new authoring service (Azure service).
Now at https://www.luis.ai/applications, notice at the upper-right “PictureBotLUIS (westus, F0)” where the Directory name usually appears in the Portal.
The pages displayed will be different if you have already created a LUIS app or have no apps created at all. Select either Go to apps, or the apps option that is available on your initial LUIS page.
Select “+ New app” for conversation Apps for the “Create new app” pop-up dialog.
The LUIS user interface is updated on a regular basis and the actual options may change, in terms of the text used. The basic workflow is the same but you may need to adapt to the UI changes for the text on some elements or instructions given here.
Take note of the other options such as the ability to import JSON or LU file that contains LUIS configuration options.
Give your LUIS app a name, for example, PictureBotLUIS.
For Culture, type “en-us” then select the appropriate choice for your language.
Select Done.
Dismiss the guidance dialog that may display.
DEFINITION: Each bot action has an intent invoked by an utterance, which gets processed by all models. LUIS selects the top-scoring model as the prediction.
An utterance that doesn’t map to any existing intent is assigned to the catch-all intent “None”.
Intents with significantly more positive examples (“data imbalance” toward that intent) are more likely to receive positive predictions.
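The selection of a top-scoring intent can be sketched in a few lines. This assumes the prediction scores are available as a dictionary of intent name to score. The minimum-score cutoff is an illustrative assumption added here, not documented LUIS behavior, and the intent names are hypothetical.

```python
def top_intent(scores, minimum_score=0.0):
    """Return the highest-scoring intent name.

    Falls back to the catch-all "None" intent when there are no scores
    or when the best score is below minimum_score (a hypothetical cutoff).
    """
    if not scores:
        return "None"
    name = max(scores, key=scores.get)
    return name if scores[name] >= minimum_score else "None"


# Hypothetical scores for the utterance "Can I get a 5x7 of this image?"
scores = {"OrderPic": 0.91, "SearchPics": 0.12, "None": 0.03}
print(top_intent(scores))
```

A cutoff like this is one way an app can guard against acting on a weak prediction, at the cost of sending more utterances to “None”.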
On your local machine (laptop), install Visual Studio Code for your operating system.
If you will be completing your coding with Python, ensure you have a Python environment installed locally. Once you have Python installed, you will need to install the Python extension for VS Code.
Alternately, to use C# as your code language, start by installing the latest .NET Core package for your platform. You can choose Windows, Linux, or macOS from the drop-down on this page. Once you have .NET Core installed, you will need to add the C# Extension to VS Code. Select the Extensions option in the left nav pane, or press CTRL+SHIFT+X and enter C# in the search dialog.
Create a folder within your local drive project folder to store the project files. Name the folder “LU_Python”. Open Visual Studio Code and Open the folder you just created. Create a Python file called “create_luis.py”.
cd
cd projects
mkdir LU_Python
cd LU_Python
touch create_luis.py
code .
Install the LUIS package to gain access to the SDK.
pip install azure-cognitiveservices-language-luis
Open an editor (Visual Studio Code)
Select Machine learned for Type. Then select Create.
https://docs.microsoft.com/en-us/learn/modules/manage-language-understanding-intelligent-service-apps/2-manage-authoring-runtime-keys
Select Azure Resources in the left tool bar.
Unless you have already created a prediction key, your screen should look similar to this. The key information is obscured on purpose.
A Starter Key provides 1000 prediction endpoint requests per month for free.
Examples of utterances are on the Review endpoint utterances page on the Build tab.
docker pull mcr.microsoft.com/azure-cognitive-services/luis:latest
Specify values for Billing LUIS authoring endpoint URI and ApiKey:
docker run --rm -it -p 5000:5000 --memory 4g --cpus 2 --mount type=bind,src=c:\input,target=/input --mount type=bind,src=c:\output\,target=/output mcr.microsoft.com/azure-cognitive-services/luis Eula=accept Billing={ENDPOINT_URI} ApiKey={API_KEY}
“--mount type=bind,src=c:\output\,target=/output” indicates the output directory where the LUIS app saves log files. The log files contain the phrases entered when users hit the endpoint with queries.
Get the AppID from your LUIS portal and paste it in the placeholder in the command.
curl -G \
  -d verbose=false \
  -d log=true \
  --data-urlencode "query=Can I get a 5x7 of this image?" \
  "http://localhost:5000/luis/v3.0/apps/{APP_ID}/slots/production/predict"
Get the AppID from your LUIS portal and paste it in the placeholder in the command.
curl -X GET \
  "http://localhost:5000/luis/v2.0/apps/{APP_ID}?q=can%20I%20get%20an%20a%205x7%20of%20this%20image&staging=false&timezoneOffset=0&verbose=false&log=true" \
  -H "accept: application/json"
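The same v3.0 prediction URL used in the first curl command can be assembled in Python, which takes care of URL-encoding the utterance. This is a minimal sketch using only the standard library. The function name and the placeholder app ID are hypothetical, and nothing is sent to the container.

```python
from urllib.parse import urlencode


def build_prediction_url(app_id, query, host="http://localhost:5000", slot="production"):
    """Build the v3.0 prediction URL for a locally running LUIS container."""
    params = urlencode({"verbose": "false", "log": "true", "query": query})
    return f"{host}/luis/v3.0/apps/{app_id}/slots/{slot}/predict?{params}"


# "APP_ID" stands in for the AppID copied from the LUIS portal.
url = build_prediction_url("APP_ID", "Can I get a 5x7 of this image?")
print(url)
# To call the container: urllib.request.urlopen(url)
```

urlencode replaces spaces with "+" and escapes the question mark, so the utterance survives the trip intact.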
DEMO: voice control lighting in a virtual home.
Select suggested utterances to see the JSON response:
Type instructions, use the microphone button to speak commands.
LUIS identifies from your utterance your intents and entities.
Entity types
Run my az-luis-cli.sh.
DOCS:
https://github.com/cloudacademy/using-the-azure-machine-learning-sdk
Create a Resource referenced when LUIS Authoring is defined. The resource name should be lower case as it is used for the endpoint URL, such as:
https://luis-resource-name.cognitiveservices.azure.com/
The Bot Framework CLI requires Node.js.
npm i -g npm
https://github.com/Microsoft/BotFramework-Emulator/blob/master/README.md
Download to downloads folder at:
https://aka.ms/bf-composer-download-mac
https://docs.microsoft.com/en-us/composer/install-composer
On a Mac, download the .dmg
https://aka.ms/bf-composer-download-mac
Use Node.js to install the latest version of Bot Framework CLI from the command line:
npm i -g @microsoft/botframework-cli
https://docs.microsoft.com/en-us/azure/bot-service/bot-builder-howto-bf-cli-deploy-luis?view=azure-bot-service-4.0
LUIS can generate a TypeScript or C# type (program code) from models.
DEMO: https://www.luis.ai
Sign in and create an authoring resource referencing the Resource Group.
PROTIP: In the list of Cognitive Services kinds, resources and subscription keys created for LUIS authoring are separate from the ones for prediction runs so that utilization of the two can be tracked separately.
https://aka.ms/AI900/Lab4 which redirects to
Create a language model with Language Understanding which trains a (LUIS) language model that can understand spoken or text-based commands. He’s Alexa’s boyfriend, ha ha.
Process Natural Language using Azure Cognitive Language Services
https://github.com/MicrosoftLearning/AI-102-LUIS contains image files for reference by https://github.com/MicrosoftLearning/AI-102-Code-Repos https://github.com/MicrosoftLearning/AI-102-Process-Speech
PROTIP: LUIS does not perform text summarization. That’s done by another service in the pipeline.
References:
Adding Language Understanding to Chatbots With LUIS by Emilio Meira
NOTE: The previous version references interactive Python Notebooks such as MS LEARN HANDS-ON LAB referencing “07 - Text Analytics.ipynb”.
Look at the DEMO GUI at:
Click “Next Step” through the various processing on a sentence:
Key phrase extraction
Bing Entity Search
API:
Some Text Analytics API services are synchronous and others are asynchronous.
Cloud Academy lab “Using Text Analytics in the Azure Cognitive Services API”
docs.microsoft.com/en-us/samples/azure/azure-sdk-for-python/textanalytics-samples
Azure Text Analytics client library samples for JavaScript has step-by-step instructions:
Logging into the Microsoft Azure Portal
Cognitive Services: https://portal.azure.com/#blade/HubsExtension/BrowseResource/resourceType/Microsoft.CognitiveServices%2Faccounts
Select the service already defined for you.
At the left menu click “Keys and Endpoint”. The Endpoint URL contains:
https://southcentralus.api.cognitive.microsoft.com/
The Endpoint is the location you’ll be able to make requests to in order to interact with the Cognitive Services API. The Key1 value is the key that will allow you to authenticate with the API. Without the Key1 value, you will receive unauthenticated errors.
In the Azure Portal, type Function App into the search bar and click Function App:
You’ll be brought to the Function App blade. While a complete summary of function apps is outside the scope of this lab, you should know the function apps allow you to create custom functions using a variety of programming languages, and to trigger them using any number of events.
In the case of this lab step, you’ll set up a function app to interact with the Cognitive Services API using Node.js. You’ll then configure the function to be triggered by visiting its URL in a browser tab.
On the left side of the blade, click on Functions and then + Add.
Click the HTTP Trigger option:
Important: If you don’t see the HTTP trigger option, click More Templates and then Finish and view templates to see the correct option.
Click Code + Test in the left sidebar then look at the upper part of the console and switch to the function.json file:
Replace the contents of the file with the following snippet and click Save:
{
  "bindings": [
    {
      "authLevel": "function",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": [ "get", "post" ]
    },
    {
      "type": "http",
      "direction": "out",
      "name": "$return"
    }
  ],
  "disabled": false
}
The function.json file manages the behavior of your function app.
Switch to the index.js file:
Replace the contents of the file with the following code:
'use strict';

let https = require('https');

const subscription_key = "INSERT_YOUR_KEY_HERE";
const endpoint = "INSERT_YOUR_ENDPOINT_HERE";
const path = '/text/analytics/v2.1/languages';

module.exports = async function (context, req) {
    let documents = {
        'documents': [
            {'id': '1', 'text': 'This is a document written in English.'},
            {'id': '2', 'text': 'Je suis une phrase écrite en français.'},
            {'id': '3', 'text': 'Este es un documento escrito en español.'},
        ]
    };
    let body = JSON.stringify(documents);
    let request_params = {
        method: 'POST',
        hostname: (new URL(endpoint)).hostname,
        path: path,
        headers: {
            'Ocp-Apim-Subscription-Key': subscription_key,
        }
    };
    let response = await makeRequest(request_params, body);
    context.res = {
        body: response
    };
};

function makeRequest(options, data) {
    return new Promise((resolve, reject) => {
        const req = https.request(options, (res) => {
            res.setEncoding('utf8');
            let responseBody = '';
            res.on('data', (chunk) => {
                responseBody += chunk;
            });
            res.on('end', () => {
                resolve(JSON.parse(responseBody));
            });
        });
        req.on('error', (err) => {
            reject(err);
        });
        req.write(data);
        req.end();
    });
}
Replace the two placeholder values with your own Key1 and Endpoint values, for example:
const subscription_key = "4779caba69344c3d97bca9863d726af6";
const endpoint = "https://southcentralus.api.cognitive.microsoft.com/";
Click Save.
While a deep understanding of code is outside the scope of this lab, you should take note of a couple of things:
The endpoint and subscription key that you set allow the function app to communicate with your Cognitive Services API. The endpoint tells the app where to find the API, and the subscription key allows the app to authenticate.
The path variable declares the exact path the app should request on the Cognitive Services API. This path system allows the API to offer many different services on one endpoint:
const path = '/text/analytics/v2.1/languages';
The documents variable declares a list of documents that will be passed to the language detection API. The API will return the languages that these documents are most likely written in. Currently there are three documents, and the goal is to get the language used in each one:
let documents = {
    'documents': [
        {'id': '1', 'text': 'This is a document written in English.'},
        {'id': '2', 'text': 'Je suis une phrase écrite en français.'},
        {'id': '3', 'text': 'Este es un documento escrito en español.'},
    ]
};
The rest of the code simply manages the formatting of the data, requesting the language detection service from the API, and returning the result. Next, you’ll visit the function app’s URL to see it work.
Look at the upper command bar and click Test/Run.
Select GET as Http Method before clicking Run:
View the output in the Output tab:
Notice that what’s returned is a JSON object with a “documents” object containing three results, one for each language you submitted to it. Each result has a “name” value with the predicted language and a “score” value with the likelihood that the decision is accurate. A score of 1 means that the Cognitive Services API was completely confident in its language detection. Here’s what the formatted JSON object looks like, for reference:
{
  "documents": [
    {
      "id": "1",
      "detectedLanguages": [
        { "name": "English", "iso6391Name": "en", "score": 1 }
      ]
    },
    {
      "id": "2",
      "detectedLanguages": [
        { "name": "French", "iso6391Name": "fr", "score": 1 }
      ]
    },
    {
      "id": "3",
      "detectedLanguages": [
        { "name": "Spanish", "iso6391Name": "es", "score": 1 }
      ]
    }
  ],
  "errors": []
}
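Once the response arrives, picking out the language for each document takes only a few lines. The sketch below parses a response with the same structure as the JSON above. The function name is hypothetical, and it assumes each document has at least one entry in "detectedLanguages".

```python
import json


def best_languages(response):
    """Map each document id to (ISO 639-1 code, score) of its top language."""
    result = {}
    for document in response["documents"]:
        top = max(document["detectedLanguages"], key=lambda lang: lang["score"])
        result[document["id"]] = (top["iso6391Name"], top["score"])
    return result


# Same structure as the formatted response shown above.
raw = """
{ "documents": [
    { "id": "1", "detectedLanguages": [ { "name": "English", "iso6391Name": "en", "score": 1 } ] },
    { "id": "2", "detectedLanguages": [ { "name": "French", "iso6391Name": "fr", "score": 1 } ] },
    { "id": "3", "detectedLanguages": [ { "name": "Spanish", "iso6391Name": "es", "score": 1 } ] }
  ],
  "errors": [] }
"""

languages = best_languages(json.loads(raw))
print(languages)
```

Keying the result by document id makes it easy to join the detected language back to the original text.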
Sentiment analysis returns a number from 0 to 1, with 1 being the most positive opinion expressed and 0 being the most negative.
Named Entity Recognition (NER) identifies entities in the text and groups them into different entity categories, such as organization name, location, event, etc.
"documents": [ { "id": "1", "keyPhrases": [ "world", "input text" ] },
vs. Content Moderator - Moderate Detect Language to Auto Correct, PII, listid, classify, language. Classification of Profanity returns JSON with several categories:
https://learn.microsoft.com/en-us/training/modules/analyze-text-ai-language/3-detect-language
Input text is submitted as a formatted JSON document. Each text string can be up to 5,120 characters, and each request can contain up to 1,000 ids, each associated with a text string.
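These two limits can be enforced before anything is sent. The sketch below is one possible approach: it rejects any text over 5,120 characters and splits the rest into request bodies of at most 1,000 documents. The function and constant names are hypothetical, and the limits are simply the figures quoted above.

```python
MAX_CHARS_PER_DOCUMENT = 5120      # character limit quoted above
MAX_DOCUMENTS_PER_REQUEST = 1000   # id limit quoted above


def build_requests(texts):
    """Split texts into request bodies that respect both limits.

    Raises ValueError if any single text is too long to submit.
    """
    for position, text in enumerate(texts, start=1):
        if len(text) > MAX_CHARS_PER_DOCUMENT:
            raise ValueError(f"Text {position} has {len(text)} characters")
    requests = []
    for start in range(0, len(texts), MAX_DOCUMENTS_PER_REQUEST):
        chunk = texts[start:start + MAX_DOCUMENTS_PER_REQUEST]
        requests.append({
            "documents": [
                {"id": str(start + offset + 1), "text": text}
                for offset, text in enumerate(chunk)
            ]
        })
    return requests


batches = build_requests(["Hello world"] * 2500)
print([len(batch["documents"]) for batch in batches])
```

The ids keep counting across batches, so results from separate requests can still be matched to the original list.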
As with GraphQL, the API returns the detected language and a numeric score between 0 and 1. Scores close to 1 indicate high confidence that the identified language is correct. A total of 120 languages are supported.
"id": "3", "detectedLanguages": [ { "name": "Spanish", "iso6391Name": "es", "score": 1 }
Only one language code is returned for each document submitted. Mixed language content within the same document returns the language with the largest representation in the content, but with a lower positive rating, reflecting the marginal strength of that assessment.
Sample Response:
{
  "documents": [{
    "id": "1",
    "entities": [{
      "name": "Seattle",
      "matches": [{
        "wikipediaScore": 0.15046201222847677,
        "entityTypeScore": 0.80624294281005859,
        "text": "Seattle",
        "offset": 26,
        "length": 7
      }],
      "wikipediaLanguage": "en",
      "wikipediaId": "Seattle",
      "wikipediaUrl": "https://en.wikipedia.org/wiki/Seattle",
      "bingId": "5fbba6b8-85e1-4d41-9444-d9055436e473",
      "type": "Location"
    }, {
      "name": "last week",
      "matches": [{
        "entityTypeScore": 0.8,
        "text": "last week",
        "offset": 34,
        "length": 9
      }],
      "type": "DateTime",
      "subType": "DateRange"
    }]
  }],
  "errors": []
}
Microsoft has incorporated Immersive Reader throughout their products. Here is a good deep dive about what features are currently available on each product and device type:
Speech-to-text (STT) has two different REST APIs:
Speech-to-text REST API v3.0 is used for Batch transcription and Custom Speech.
Speech-to-text REST API for short audio is used for online transcription as an alternative to the Speech SDK. Requests using this API can transmit only up to 60 seconds of audio per request.
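One way to decide between the two APIs is to check the length of the recording first. The sketch below uses Python's standard `wave` module and assumes the audio is an uncompressed WAV file; the helper names are hypothetical, and `make_silent_wav` exists only to produce demonstration input.

```python
import io
import wave

SHORT_AUDIO_LIMIT_SECONDS = 60  # limit stated above for the short-audio REST API

def wav_duration_seconds(wav_file):
    """Return the duration of a WAV file (path or file-like object) in seconds."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()

def fits_short_audio_api(wav_file):
    """True when the recording is short enough for the short-audio REST API."""
    return wav_duration_seconds(wav_file) <= SHORT_AUDIO_LIMIT_SECONDS

def make_silent_wav(seconds, rate=8000):
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer

if __name__ == "__main__":
    for seconds in (5, 90):
        target = "short-audio API" if fits_short_audio_api(make_silent_wav(seconds)) else "batch transcription"
        print(f"{seconds}s recording -> {target}")
```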
Microsoft Cognitive Services Speech SDK Samples
LAB:
https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/
https://docs.microsoft.com/en-us/learn/modules/transcribe-speech-input-text/?WT.mc_id=cloudskillschallenge_efc530c5-7105-4c12-8eb3-bc20ae3bee78
Transcriptions can be done in real-time or in batch mode.
Batch mode is when audio recordings are stored on a file share, and a shared access signature (SAS) URI is used by a program to asynchronously receive transcription results.
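A hedged sketch of what a batch transcription request might look like in Python. The URL path and the `contentUrls`, `locale`, and `displayName` fields are written as I recall them from the v3.0 REST API reference, so verify them against the current documentation. The SAS URI, region, and function name are placeholders, and the code only builds the request without sending it.

```python
import json

def batch_transcription_request(region, sas_uris, locale="en-US",
                                display_name="Batch transcription"):
    """Return (url, json_body) for creating a batch transcription job."""
    # Assumed v3.0 endpoint pattern; the region is the Speech resource's region.
    url = f"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions"
    body = {
        "contentUrls": list(sas_uris),  # SAS URIs of the stored recordings
        "locale": locale,
        "displayName": display_name,
    }
    return url, json.dumps(body)

if __name__ == "__main__":
    # Placeholder SAS URI; a real one carries a signature in its query string.
    url, body = batch_transcription_request(
        "westus", ["https://example.blob.core.windows.net/audio/call1.wav?sv=PLACEHOLDER"]
    )
    print(url)
    print(body)
```

The job is asynchronous: the POST would return a transcription resource whose status is polled until the result files are ready.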
Take the introductory tutorial:
Introduction to Machine Learning with Hands-On Labs
https://azure.microsoft.com/en-us/documentation/articles/machine-learning-studio-overview-diagram
Create a model.
Prepare Data:
As per this video using
Train the model
Score and test the model.
Make predictions with Elastic APIs
https://github.com/timothywarner/ai100/tree/master/Speech-to-Text
Get the monthly subscription mobile app on iPhone, Android, or Amazon. It has a Phrasebook of common phrases.
DEMO: Speech Translation recognizes and synthesizes speech, and translates spoken languages. REMEMBER: The sequence of services involves two APIs:
Speech-to-Text API -> Speech Correction -> Machine Translation -> Text-to-Speech API
“Speech Recognition” and Text Analysis are not involved in this use case.
Telephone voice menus use “Speech Synthesis”, defined by the Speech Synthesis Markup Language (SSML).
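To make SSML concrete, here is a minimal Python sketch that builds an SSML document for a voice-menu prompt. The voice name "en-US-JennyNeural" and the function name `build_ssml` are assumptions chosen for illustration; the prompt text is escaped so characters such as "&" do not break the XML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NAMESPACE = "http://www.w3.org/2001/10/synthesis"

def build_ssml(text, voice="en-US-JennyNeural", lang="en-US"):
    """Return an SSML string that speaks `text` with the given voice."""
    return (
        f'<speak version="1.0" xmlns="{SSML_NAMESPACE}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )

if __name__ == "__main__":
    ssml = build_ssml("Press 1 for sales & support.")
    ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
    print(ssml)
```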
https://github.com/MicrosoftLearning/mslearn-ai900/blob/main/08%20-%20Speech.ipynb
https://github.com/timothywarner/ai100/tree/master/Speech-to-Text
https://github.com/MicrosoftLearning/AI-SpeechToText
The text-to-speech service includes multiple pre-defined voices with support for multiple languages and regional pronunciations, with language detection. In addition to standard voices, neural voices leverage neural networks to overcome common limitations in speech synthesis with regard to intonation, resulting in a more natural sounding voice.
PROTIP: Neural voices are created from samples that use a 24 kHz sample rate.
Speech recognition uses an acoustic model that converts the audio signal into phonemes (sounds) and a language model that matches phonemes with words.
Custom voices can be created with the text-to-speech API.
PROTIP: Since you have to use your own subscription to follow this tutorial from Microsoft, skip clicking “Launch VM mode” and follow the Python notebook on Speech on the regular Portal.
SAMPLE: https://docs.microsoft.com/en-us/samples/azure-samples/cognitive-speech-tts/azure-cognitive-tts-samples/
PROTIP: In a CLI window, run my Bash shell script to Create a Cognitive Services resource and get its two keys:
cd ~/clouddrive/azure-quickly
git pull
./az-cog-cli.sh
To synthesize speech, the system typically tokenizes the text to break it down into individual words, and assigns phonetic sounds to each word. It then breaks the phonetic transcription into prosodic units (such as phrases, clauses, or sentences) to create phonemes that will be converted to audio format. These phonemes are then synthesized as audio by applying a voice, which will determine parameters such as pitch and timbre; and generating an audio wave form that can be output to a speaker or written to a file.
To specify that the speech input to be transcribed to text is in an audio file, use AudioConfig.
Change the voice used in speech synthesis by setting the SpeechSynthesisVoiceName property of the SpeechConfig object to the desired voice name.
Text (text-to-text aka TTT)
Microsoft’s Translator service can translate text between more than 90 languages and dialects (including Klingon in Star Trek), specified using ISO 639-1 two-letter language codes and 3166-1 cultural codes such as “en-US” for US English, “en-GB” for British English, “fr-CA” for Canadian French, etc.
Hands-on tool without a compute instance:
Click on “Start conversation”, log in and enter your name and language.
Share the conversation code with other participants, who can join using the Microsoft Translator app or website.
Speak or type in your language to communicate with other participants in the conversation. Other participants will see your messages in their own language.
https://docs.microsoft.com/en-us/azure/cognitive-services/translator/custom-translator/overview What is Custom Translator?
For Parallel Data, equivalent documents in different languages:
https://docs.microsoft.com/en-us/azure/cognitive-services/translator/custom-translator/how-to-upload-document
LEARN: Translate text with Azure AI Translator service
BLAH: You are asked to use your own Subscription anyway, so instead of the Exercise - Translate text and speech, use portal.azure.com directly.
A Python program can run from your laptop or mobile phone making API calls to the Translator endpoint at:
https://api.cognitive.microsofttranslator.com/translate?api-version=3.0
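A minimal sketch of such a Python program, using only the standard library. It builds (but does not send) a POST request to the endpoint above. The `to` query parameter, the `[{"Text": ...}]` body, and the `Ocp-Apim-Subscription-Key` and `Ocp-Apim-Subscription-Region` headers follow the Translator v3 conventions as I understand them; the key, region, and function name are placeholders.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"

def build_translate_request(texts, to_languages, key, region):
    """Return a urllib Request that asks Translator to translate each text."""
    query = urllib.parse.urlencode(
        [("api-version", "3.0")] + [("to", lang) for lang in to_languages]
    )
    body = json.dumps([{"Text": text} for text in texts]).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,        # placeholder, from your resource
        "Ocp-Apim-Subscription-Region": region,  # placeholder, e.g. the resource region
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{ENDPOINT}?{query}", data=body, headers=headers, method="POST"
    )

if __name__ == "__main__":
    request = build_translate_request(["Hello"], ["fr", "es"], "YOUR_KEY", "YOUR_REGION")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # With a real key, urllib.request.urlopen(request) would send it.
```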
VIDEO: Raza Salehi’s 1 hr video course “Build a Translator system”.
VIDEO intro with sample code at https://github.com/microsoft/text-analytics-walkthrough
A response containing “script” : “Latn” means the text was transliterated into Latin script.
A custom translator is needed to train a model to recognize and translate domain-specific words and phrases in specific industries such as aerospace, automotive, chemistry, mechanical, etc.
portal.customtranslator.azure.ai
Training is done by having pairs of documents (English and French, etc.).
10,000 aligned parallel sentences are needed to train a translator.
In addition to Microsoft Office formats, files with extension .ALIGN for parallel languages are perfectly aligned. Translation Memory systems can export parallel documents with XLF, XLIFF, or TMX suffixes. Microsoft’s LocStudio files have the .LCL suffix.
Translation runs can each take several hours. So batch processing is supported.
If you don’t have admin rights, you may see this message:
“Need admin approval: Mt Studio Web Prod needs permission to access resources in your organization that only an admin can grant. Please ask an admin to grant permission to this app before you can use it.”
References:
https://docs.microsoft.com/en-us/samples/azure/azure-sdk-for-python/documenttranslation-samples/
HISTORY: In 2016, Microsoft unleashed the Tay chat bot, then had to bring it down after hackers submitted enough racial slurs that they fooled the system into thinking that was normal and acceptable.
HISTORY: XiaoIce, a chatbot Microsoft launched in China, has more than 200 million users, has engaged in 30 billion conversations, and has an average conversation length of 23 turns, which averages out to about half an hour. Microsoft also reported achieving human parity at translation from Chinese to English. Related chatbots include Japan-based Rinna and the US-based Zo.
The Bot Framework enables the creation of Virtual Assistants.
A LUIS app creates these types of entities:
https://docs.microsoft.com/en-us/azure/cognitive-services/qnamaker/concepts/plan?tabs=v1
Bots are extended by Skills
The cognitive service named “QnA Maker” (Question and Answer Maker) is a cloud-based API service that lets you create a conversational question-and-answer layer over your existing data. The service enables the building of knowledge bases of questions and answers that form the basis of a dialog between a human and an AI agent.
Microsoft created the QnA Maker portal to make it easier than writing code to create and manage knowledge bases using the QnA Maker REST API or SDK.
The knowledge base gets smarter as it continually learns from user behavior.
The knowledge base can be built by extracting questions and answers from your semi-structured content, including FAQs, manuals, and documents.
QnA Maker limits control the size of Knowledge base.
Create QnA Service
https://github.com/Microsoft/BotBuilder-CognitiveServices/tree/master/CSharp/Samples/QnAMaker is only for C#.
View the DOCS:
View the v2 (previous release)
The Jupyter notebook:
https://github.com/MicrosoftLearning/mslearn-ai900/blob/main/11%20-%20QnA%20Bot.ipynb
Go to the QnA Maker portal at:
Sign in.
https://docs.microsoft.com/en-us/learn/paths/explore-conversational-ai/
Basics: During testing, do NOT click the checkbox for “Managed”. In prod, telemetry and compute are included automatically with your QnA Maker resource. If you do not select managed, you will be prompted to create App Insights and App Service resources for the required telemetry and compute, which you will have to manage for your QnA Maker resource. Read more at https://aka.ms/qnamaker-createoptions-description
Resource group location: “(US) West US”
App Service details - for runtime :
Note: If you have already provisioned a free-tier QnA Maker or Azure Search resources, your quota may not allow you to create another one. In which case, select a tier other than F0 / F.
App Service details - for runtime :
Website location: Same as Azure Search location
App insights details - for telemetry and chat logs :
App insights: Disable, which will hide the “App insights location”, but appear in Review.
While you wait for the dots to stop flashing “Deployment in progress”, return to the QnA Maker portal tab. You may have timed out.
When “Your deployment is complete”, click “Go to resources” for “Congratulations! Your keys are ready.”
STEP 2: Connect your QnA service to your KB.
Azure QnA service: The QnA service resource you created in the previous step
NOTE: In the Preview there is a checkbox “Enable language setting per knowledge base”.
Language: English
STEP 3: Name your KB.
Type a name: For example: “Margie’s Travel KB”. Spaces are allowed?
STEP 4: Populate your KB.
Copy and paste this example URL:
https://github.com/MicrosoftDocs/ai-fundamentals/raw/master/data/qna_bot/margies_faq.docx
Add file
QUESTION: What is the range of popularity?
QUESTION: Extraction? I’m stuck here.
Do NOT check “Enable multi-turn extraction from URLs, .pdf or .docx files.”
Click “Create your KB”. Wait for a minute or so while your Knowledge base is created.
Review the questions and answers that have been imported from the FAQ document and the professional chit-chat pre-defined responses.
https://go.microsoft.com/fwlink/?linkid=2100125
Coding: https://go.microsoft.com/fwlink/?linkid=2100213
Anomaly Detector will be retired on 10/1/2026.
https://docs.microsoft.com/en-us/samples/azure/azure-sdk-for-net/azure-anomaly-detector-client-sdk-samples/
Among Anomaly Detector API Samples
Anomaly Detector identifies potential problems early on.
https://docs.microsoft.com/en-us/learn/modules/get-started-ai-fundamentals/3-understand-anomaly-detection
Resources:
Content Moderator services detect potentially offensive or unwanted content.
This has been deprecated.
Evaluate, Find Faces, Match, OCR.
https://www.youtube.com/watch?v=gVFiA6ZQNAw
https://docs.microsoft.com/azure/cognitive-services/Content-Moderator/overview?WT.mc_id=Portal-Microsoft_Azure_Support#data-privacy-and-security
https://docs.microsoft.com/en-us/azure/cognitive-services/content-moderator/client-libraries?tabs=visual-studio&pivots=programming-language-csharp
Response from the Text Moderation API include:
Content Moderation (Evaluate, Find Faces, Match, OCR)
“Create Custom Moderator”
Oops!
Could not create the marketplace item
This marketplace item is not available.
When working:
Select Create.
Enter a unique name for your resource, select a subscription, and select a location close to you.
Select the pricing tier for this resource, and then select F0.
Create a new resource group
Metrics Advisor monitors metrics and diagnoses issues.
https://docs.microsoft.com/en-us/samples/azure/azure-sdk-for-python/metricsadvisor-samples/
Personalizer creates rich, personalized experiences for every user.
Knowledge Mining Solution Accelerator
HANDS-ON: Tutorial references these labfiles
Get on the Bing Resource portal GUI.
Define a Resource Group.
Price Tier: Free
PROTIP: Autosuggest requires the “S2” (Standard) pricing tier. Spell Check requires either S1 or S2.
Create
Notice the service name at the upper left is “Microsoft.BingSearch” and has a Global location. Its endpoint is: https://api.bing.microsoft.com/
The Azure Cognitive Search service uses a Cognitive Search resource to support AI-powered search and knowledge mining solutions such as:
https://blog.api.rakuten.net/top-10-best-search-apis/
“Document cracking” during indexing extracts text content from unstructured text or non-text content (such as images, scanned documents, or JPEG files). The indexer accesses an Azure data storage service.
https://blog.scottlowe.org/2019/03/01/advanced-ami-filtering-with-jmespath/
Azure Bot Service provides a platform for creating, publishing, and managing bots. Developers can use the Bot Framework to create a bot and manage it with Azure Bot Service, integrating back-end services like QnA Maker and LUIS, and connecting to channels for web chat, email, Microsoft Teams, and others.
Microsoft Bot Framework supports two approaches to integrate bots with agent engagement platforms such as Customer support service:
DEMO: See a healthcare bot built using the Azure Bot Service:
https://www.microsoft.com/research/project/health-bot/
Select the option to Try a demo of an example end-user experience. Use the web chat interface to interact with the bot.
MS LEARN: Create a Bot with the Bot Framework Composer
Run the Python Jupyter notebook
Sign in using the Microsoft account associated with your Azure subscription.
PROTIP: Use NVM to install Node
https://github.com/Microsoft/botbuilder-tools#install-cli-tools says to install Node.js version 10.14.1 or higher
https://github.com/microsoft/botframework-cli says to install Node.js version 12
Since the current version is now 16, we cannot use the command suggested in the doc:
npm i -g @microsoft/botframework-cli
bf is the bot framework CLI command
One-stop-shop CLI to manage your bot’s resources. BF CLI and AZ CLI together cover your end-to-end bot development workflow needs.
VERSION
  @microsoft/botframework-cli/4.13.3 darwin-x64 node-v16.1.0
USAGE
  $ bf [COMMAND]
COMMANDS
  chatdown      Converts chat dialog files in <filename>.chat format into transcript files. Writes corresponding <filename>.transcript for each .chat file.
  config        Configure various settings within the cli.
  dialog        Dialog related commands for working with .schema and .dialog files.
  help          display help for bf
  lg            Parse, collate, expand and translate lg files.
  luis          Manages LUIS assets on service and/or locally.
  orchestrator  Display Orchestrator CLI available commands
  plugins       Install, uninstall and show installed plugins
  qnamaker      QnA Maker
References:
OpenAI is a San Francisco-based artificial intelligence research laboratory. OpenAI was founded by Elon Musk, Sam Altman, Greg Brockman, and Ilya Sutskever in December 2015 (to compete with Google’s DeepMind acquisition).
In 2019 Microsoft invested $1 billion in the company as time on Azure cloud and to develop a large-scale AI supercomputer built exclusively for OpenAI’s research in Azure. Azure powers all of OpenAI’s workloads.
In 2020, OpenAI exclusively licensed (closed-source) GPT-3 to Microsoft for their products and services.
In 2022, OpenAI made available their GPT-3.5 foundation model for free trial, offering several categories of capabilities
Source: https://openai.com/blog/openai-microsoft/
OpenAI’s avowed mission is to create Artificial General Intelligence (AGI) (to rival human ability).
OpenAI is a “separate” service from Azure Cognitive Services because “traditional” Azure Cognitive Services focuses on making predictions based on textual and discrete data, whereas OpenAI added “attention” algorithms to ML, working on binary data (voice, images, and video), to enable “Generative AI”, which produces new content based on what is described in the input.
Consider pricing at https://azure.microsoft.com/pricing/details/cognitive-services/openai-service
OpenAI put its more advanced GPT-4 models behind a paywall.
Billing is based on 1,000 “tokens” increments, with the first 100,000 tokens per month free. Beyond that:
“Standard” use of the older/more limited gpt-3.5-turbo model is then $0.002 per 1,000 tokens.
Charges for the GPT-4 model have two dimensions: process stage and the size of the foundational model used:
The “32K context” refers to a context window of about 32,000 tokens, not to model size. (175 billion is the published parameter count of GPT-3.)
Images generated using DALL-E are $2 per 100 images.
Available for free for the first 100,000 tokens per month, then $0.004 per 1,000 tokens.
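The arithmetic behind these prices can be sketched in Python. The figures are the ones quoted above (100,000 free tokens per month, $0.002 per 1,000 gpt-3.5-turbo tokens, $2 per 100 DALL-E images) and may be out of date, so check the pricing page before relying on them. The function names are hypothetical, and the sketch assumes pro-rata billing rather than rounding up to whole 1,000-token increments.

```python
from decimal import Decimal

# Figures quoted in the text above; verify against the current pricing page.
FREE_TOKENS_PER_MONTH = 100_000
GPT35_TURBO_PRICE_PER_1K = Decimal("0.002")
DALLE_PRICE_PER_100_IMAGES = Decimal("2")

def monthly_token_cost(tokens_used,
                       price_per_1k=GPT35_TURBO_PRICE_PER_1K,
                       free_tokens=FREE_TOKENS_PER_MONTH):
    """Dollar cost of a month's tokens after subtracting the free allowance."""
    billable = max(0, tokens_used - free_tokens)
    return Decimal(billable) / Decimal(1000) * price_per_1k

def image_cost(images):
    """Dollar cost of generating the given number of DALL-E images."""
    return Decimal(images) / Decimal(100) * DALLE_PRICE_PER_100_IMAGES

if __name__ == "__main__":
    print("250,000 tokens:", monthly_token_cost(250_000))
    print("250 images:", image_cost(250))
```

`Decimal` is used instead of floats so that small per-token prices do not pick up binary rounding errors.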
Apply for access to Azure OpenAI for your Region and Currency:
AOAI GPT-3.5, GPT-3.5 Turbo, GPT-4, GPT-4 Turbo, and/or Embeddings Models (Conversational AI, Search, Summarization, Writing Assistance or content generation, Code-based scenarios, Reason over Structured and Unstructured data) (a Limited Access Cognitive Service):
Chat and conversation interaction: Users can interact with a conversational agent that responds with responses drawn from trusted documents such as internal company documentation or tech support documentation; conversations must be limited to answering scoped questions. Available to internal, authenticated external users, and unauthenticated external users.
Chat and conversation creation: Users can create a conversational agent that responds with responses drawn from trusted documents such as internal company documentation or tech support documentation; conversations must be limited to answering scoped questions. Limited to internal users only.
Code generation or transformation scenarios: For example, converting one programming language to another, generating docstrings for functions, converting natural language to SQL. Limited to internal and authenticated external users.
Journalistic content: For use to create new journalistic content or to rewrite journalistic content submitted by the user as a writing aid for pre-defined topics. Users cannot use the application as a general content creation tool for all topics. May not be used to generate content for political campaigns. Limited to internal users.
Question-answering: Users can ask questions and receive answers from trusted source documents such as internal company documentation. The application does not generate answers ungrounded in trusted source documentation. Available to internal, authenticated external users, and unauthenticated external users.
Reason over structured and unstructured data: Users can analyze inputs using classification, sentiment analysis of text, or entity extraction. Examples include analyzing product feedback sentiment, analyzing support calls and transcripts, and refining text-based search with embeddings. Limited to internal and authenticated external users.
Search: Users can search trusted source documents such as internal company documentation. The application does not generate results ungrounded in trusted source documentation. Available to internal, authenticated external users, and unauthenticated external users.
Summarization: Users can submit content to be summarized for pre-defined topics built into the application and cannot use the application as an open-ended summarizer. Examples include summarization of internal company documentation, call center transcripts, technical reports, and product reviews. Limited to internal, authenticated external users, and unauthenticated external users.
Writing assistance on specific topics: Users can create new content or rewrite content submitted by the user as a writing aid for business content or pre-defined topics. Users can only rewrite or create content for specific business purposes or pre-defined topics and cannot use the application as a general content creation tool for all topics. Examples of business content include proposals and reports. May not be selected to generate journalistic content (for journalistic use, select the above Journalistic content use case). Limited to internal users and authenticated external users.
Data generation for fine-tuning: Users can use a model in Azure OpenAI to generate data which is used solely to fine-tune (i) another Azure OpenAI model, using the fine-tuning capabilities of Azure OpenAI, and/or (ii) another Azure AI custom model, using the fine-tuning capabilities of the Azure AI service. Generating data and fine-tuning models is limited to internal users only; the fine-tuned model may only be used for inferencing in the applicable Azure AI service and, for Azure OpenAI service, only for customer’s permitted use case(s) under this form.
DALL-E 2 and/or DALL-E 3 models (text to image)
Accessibility Features: For use to generate imagery for visual description systems. Limited to internal users and authenticated external users.
Art and Design: For use to generate imagery for artistic purposes only for designs, artistic inspiration, mood boards, or design layouts. Limited to internal and authenticated external users.
Communication: For use to create imagery for business-related communication, documentation, essays, bulletins, blog posts, social media, or memos. This use case may not be selected to generate images for political campaigns or journalistic content (for journalistic use, see the Journalistic content use case below). Limited to internal and authenticated external users.
Education: For use to create imagery for enhanced or interactive learning materials, either for use in educational institutions or for professional training. Limited to internal users and authenticated external users.
Entertainment: For use to create imagery to enhance entertainment content such as video games, movies, TV, videos, recorded music, podcasts, audio books, or augmented or virtual reality. This use case may not be selected to generate images for political campaigns or journalistic content (for journalistic use, see the below Journalistic content use case). Limited to internal and authenticated external users.
Journalistic content: For use to create imagery to enhance journalistic content. May not be used to generate images for political campaigns. Limited to internal users.
Marketing: For use to create marketing materials for product or service media, product introductions, business promotion, or advertisements. May not be used to create personalized or targeted advertisements to individuals. This use case may not be selected to generate images for political campaigns or journalistic content (for journalistic use, see the above Journalistic content use case). Limited to internal and authenticated external users.
Most Valuable Professional (MVP) or Regional Director (RD) Demo Use: Azure OpenAI Service DALL·E capability (in accordance with a use case listed in this Question [X]). No production use, sale, or other disposition of an application is permitted under this use case; if an MVP, RD, or their employer wants to use an Azure OpenAI Service application in production, a separate form must be submitted, the appropriate use case must be selected, and a separate eligibility determination will be made.
OpenAI Whisper model (Speech-to-Text)
GPT-4 Turbo with Vision
Chat and conversation interaction: Users can interact with a conversational agent that responds with information drawn from trusted documentation such as internal company documentation or tech support documentation. Conversations must be limited to answering scoped questions. Available to internal, authenticated external users, and unauthenticated external users.
Chatbot and conversational agent creation: Users can create conversational agents that respond with information drawn from trusted documents such as internal company documentation or tech support documents. For instance, diagrams, charts, and other relevant images from technical documentation can enhance comprehension and provide more accurate responses. Conversations must be limited to answering scoped questions. Limited to internal users only.
Code generation or transformation scenarios: Converting one programming language to another or enabling users to generate code using natural language or visual input. For example, users can take a photo of handwritten pseudocode or diagrams illustrating a coding concept and use the application to generate code based on that. Limited to internal and authenticated external users.
Reason over structured and unstructured data: Users can analyze inputs using classification, sentiment analysis of text, or entity extraction. Users can provide an image alongside a text query for analysis. Limited to internal and authenticated external users.
Summarization: Users can submit content to be summarized for pre-defined topics built into the application and cannot use the application as an open-ended summarizer. Examples include summarization of internal company documentation, call center transcripts, technical reports, and product reviews. Limited to internal, authenticated external users, and unauthenticated external users.
Writing assistance on specific topics: Users can create new content or rewrite content submitted by the user as a writing aid for business content or pre-defined topics. Users can only rewrite or create content for specific business purposes or pre-defined topics and cannot use the application as a general content creation tool for all topics. Examples of business content include proposals and reports. May not be selected to generate journalistic content (for journalistic use, select the above Journalistic content use case). Limited to internal users and authenticated external users.
Search: Users can search for content in trusted source documents and files such as internal company documentation. The application does not generate results ungrounded in trusted source documentation. Limited to internal users only.
Image and Video Tagging: Users can identify and tag visual elements, including objects, living beings, scenery, and actions within an image or recorded video. Users may not attempt to use the service to identify individuals. Limited to internal users and authenticated external users.
Image and Video Captioning: Users can generate descriptive natural language captions for visuals. Beyond simple descriptions, the application can identify and provide textual insights about specific subjects or landmarks within images and recorded video. If shown an image of the Eiffel Tower, the system might offer a concise description or highlight intriguing facts about the monument. Generated descriptions of people may not be used to identify individuals. Limited to internal users and authenticated external users.
Object Detection: For use to identify the positions of individual or multiple objects in an image by providing their specific coordinates. For instance, in an image that has scattered apples, the application can identify and indicate the location of each apple. Through this application, users can obtain spatial insights regarding objects captured in images. This use case is not yet available for videos. Limited to internal users and authenticated external users.
Visual Question Answering: Users can ask questions about an image or video and receive contextually relevant responses. For instance, when shown a picture of a bird, one might ask, “What type of bird is this?” and receive a response like, “It’s a European robin.” The application can identify and interpret context within images and videos to answer queries. For example, if presented with an image of a crowded marketplace, users can ask, “How many people are wearing hats?” or “What fruit is the vendor selling?” and the application can provide the answers. The system may not be used to answer identifying questions about people. Limited to internal users and authenticated external users.
Brand and Landmark recognition: The application can be used to identify commercial brands and popular landmarks in images or videos from a preset database of thousands of global logos and landmarks. Limited to internal users and authenticated external users.
https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview#whats-included-in-the-data-science-vm
Microsoft AI ML Community in Singapore
If you have an OReilly.com account:
On Udemy:
Steps for data transformation:
https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/limits-and-quotas
https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/the-ai-study-guide-azure-machine-learning-edition/ba-p/4063656
https://azure.github.io/Cloud-Native/ https://github.com/azure/cloud-native is a showcase on Azure Cloud Native, the products, events and how to get started or go deep with cloud native technologies, including Serverless on Azure.
Using Microsoft’s API algorithms and data (such as celebrity faces, landmarks, etc.) means there has been some vetting by Microsoft’s FATE (Fairness, Accountability, Transparency, and Ethics) research group in NYC:
Microsoft’s ethical principles guiding the development and use of artificial intelligence with people:
Resources:
Mehrnoosh Sameki has a course in Responsible AI at https://cs-people.bu.edu/sameki/ResponsibleAI.html
“AI Alignment” refers to making AI systems pursue the goals their designers intend, so as to avoid unintended consequences.
VIDEO: Another OpenAI Scientist QUITS —Says AGI Is a ‘TICKING TIME BOMB’
For Azure AI Foundry Models you pay for:
MaaP = Models as a Platform means you pay for the compute size chosen. Microsoft offers GPU model tuning, where you deploy MaaP models to dedicated GPU hardware such as the A100 and then complete fine-tuning. See https://aka.ms/ignite/pre016 for examples of fine-tuning with Azure.
MaaS = Models as a Service for Tokens consumed for pay-as-you-go inference APIs and hosted fine-tuning for Llama 2 family models. Currently, there’s no extra charge for Azure AI Foundry outside of typical AI services and other Azure resource charges.
GPU acceleration is hard to get cost-effectively on Azure AI Foundry, so look at Azure GPU VM compute costs. See the Azure pricing calculator for estimates: https://azure.microsoft.com/en-gb/pricing/calculator/
Lee Stott, in his Oct 2024 “The Future of AI”, says:
To fine-tune your own model, follow https://aka.ms/ignite/pre016, which shows how to fine-tune models using Azure AI Foundry 1-click fine-tuning or Microsoft Olive pipeline fine-tuning. You can also use the tool to fine-tune models from Hugging Face.
GitHub Models are available at https://github.com/marketplace/models and each model card displays the quota/capacity limit
Yujian Tang’s Oct 2024 “Advanced RAG Applications with Vector Databases” makes use of Python 3.11 in VS Code.
This is one of a series on AI, Machine Learning, Deep Learning, Robotics, and Analytics: