Course overview
About this course
This course focuses on building intelligent applications that can see, interpret, and reason over images and documents using multimodal models and agent-based tools. Learners explore how visual and document inputs can be combined with language models to enable structured extraction, analysis, and decision-making workflows. The course emphasizes practical patterns for extracting information, orchestrating tools, and grounding model responses in visual data.
Audience profile
This course is designed for developers, AI engineers, and technical professionals who want to build applications that work with images and documents using multimodal, agent-driven approaches. It’s best suited for learners with basic programming experience and a general understanding of cloud or AI concepts.
Course details
Module 1: Develop a vision-enabled generative AI application
Module 2: Generate images with AI
Module 3: Generate videos with Microsoft Foundry
Module 4: Analyze images with Content Understanding
Module 5: Create a multimodal analysis solution with Azure Content Understanding
Module 6: Create an Azure Content Understanding client application
Module 7: Extract data with Azure Document Intelligence
Module 8: Create a knowledge mining solution with Azure AI Search
Prerequisites
- Familiarity with Azure and Microsoft Foundry.
- Programming experience.
Course: AI-3008: Extract insights from visual data on Azure