
Building a Self-Maintaining Application Catalogue with Graph API and AI
The application catalogue that maintains itself — pulling inventory from Graph API, classifying with AI, and surfacing your MSIX migration shortlist automatically.
In Part 4 of the Windows Endpoint Management series, I described the catalogue problem in about 200 words and waved my hands at a solution. Graph API, Power Automate, AI classification — the building blocks were named but not assembled. This post is the 4,000-word version where we actually build the thing.
If you've worked in endpoint management for any length of time, you already know the problem. Someone asks "what applications do we have deployed?" and the answer takes three weeks of archaeology through Intune portals, spreadsheets of varying vintage, and conversations with people who left the organisation two years ago. The catalogue that was supposed to answer this question died quietly in a SharePoint list that nobody updated after March.
This post fixes that. Not by being more disciplined about spreadsheets — that's been tried — but by building a pipeline that maintains the catalogue automatically, enriches it with AI, and surfaces the exact outputs you need to make migration and governance decisions.
Why Manual Catalogues Die
Part 3 introduced the application catalogue as "a centralised, authoritative record of every application in your estate." The fields were clear: application name, version, packaging format, owner, deployment method, ring assignments, dependencies, lifecycle status. The advice was sound: export your Intune app list, enrich it, and maintain it.
Here's what actually happens.
Someone exports the list. They spend an afternoon adding columns for owner and lifecycle status. They present it at a team meeting. Everyone agrees it's valuable. For about three weeks, new applications get added and versions get updated. Then a deadline hits, the catalogue update gets skipped once, then twice, and within a quarter it's a historical artefact — accurate as of the date it was last touched, misleading for everything after.
The failure mode isn't laziness. It's that manual catalogue maintenance requires ongoing effort with no immediate reward. The person updating the spreadsheet doesn't benefit from doing it. The benefit accrues later, to a different person, when they need to answer a question the catalogue can answer. This asymmetry between effort and benefit kills every manual catalogue eventually.
The fix is obvious in retrospect: stop asking humans to do what machines can do. Intune already knows what's deployed and what's installed. The data exists. It just needs extracting, correlating, and enriching — and all three of those steps can be automated.
Architecture
[Architecture overview: Data Sources → Processing → Actionable Outputs]
The pipeline has three tiers:
Data collection. The Intune Graph API exposes two datasets: managed apps (what you intentionally deploy) and detected apps (what's actually installed). A scheduled process pulls both.
Processing. The raw data is normalised, deduplicated, and sent to an AI model for classification. The AI correlates the two datasets, identifies gaps, and enriches each entry with actionable metadata — is this app orphaned? Is it an MSIX candidate? Is it past end-of-life?
Output. Results land in a SharePoint list that serves as the living catalogue. Views and filters surface the actionable items: orphaned software to investigate, MSIX conversion candidates to prioritise, applications flagged for retirement.
Why Power Automate? Because the audience for this pipeline is IT operations teams, and most of them already have Power Automate as part of their Microsoft 365 licensing. No Azure subscription, no custom infrastructure, no deployment pipeline to maintain. If you prefer code, the companion repository also provides a pure PowerShell implementation you can run from a local machine, Task Scheduler, Azure Automation, or a CI/CD pipeline.
What Graph API Gives You
Two endpoints do most of the work.
Managed Apps
GET /deviceManagement/mobileApps returns every application that your organisation has configured in Intune — Win32 packages, MSIX bundles, Store apps, WinGet deployments. The @odata.type property tells you the packaging format, which matters enormously for the MSIX migration analysis.
```powershell
# Fetch managed apps with packaging format
$uri = "https://graph.microsoft.com/beta/deviceManagement/mobileApps?`$top=100"
$response = Invoke-RestMethod -Uri $uri -Headers $headers -Method Get

# The @odata.type reveals the format:
# #microsoft.graph.win32LobApp → Win32
# #microsoft.graph.windowsAppX → MSIX
# #microsoft.graph.winGetApp → WinGet
# #microsoft.graph.microsoftStoreForBusinessApp → Store
```
Detected Apps
GET /deviceManagement/detectedApps returns what's actually installed across your estate, regardless of how it got there. This is the ground truth — every application the Intune agent has discovered on managed devices, complete with a device count showing how widespread each installation is.
```powershell
# Fetch detected apps with device counts
$uri = "https://graph.microsoft.com/beta/deviceManagement/detectedApps?`$top=100"
$response = Invoke-RestMethod -Uri $uri -Headers $headers -Method Get

# Each entry includes:
# displayName, version, publisher, platform, deviceCount
```
The Gap
The managed list tells you what should be there. The detected list tells you what is there. The gap between them is your entire problem space:
| Scenario | Managed | Detected | Classification |
|---|---|---|---|
| App deployed and found on devices | Yes | Yes | Managed — healthy |
| App in Intune but not detected anywhere | Yes | No | Check assignments |
| App found on devices but not in Intune | No | Yes | Orphaned — investigate |
| Win32 app with no complex install logic | Yes | Yes | MSIX candidate |
That third row — installed but not managed — is where the security and compliance risk lives. Users installing software outside Intune, legacy deployment tools still pushing packages, vendor bloatware that came with hardware. You can't address what you can't see, and the detected apps endpoint makes it visible.
Both endpoints paginate via @odata.nextLink. For estates with more than a few hundred apps, you'll need to follow the pagination chain. The repo scripts handle this automatically.
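If you're scripting it yourself, a minimal sketch of that chain looks like this — assuming an app registration with DeviceManagementApps.Read.All and a client secret, with $tenantId, $clientId, and $clientSecret supplied from your own configuration:

```powershell
# Acquire an app-only token via the client credentials flow
$tokenBody = @{
    grant_type    = "client_credentials"
    client_id     = $clientId
    client_secret = $clientSecret
    scope         = "https://graph.microsoft.com/.default"
}
$token = (Invoke-RestMethod -Method Post `
    -Uri "https://login.microsoftonline.com/$tenantId/oauth2/v2.0/token" `
    -Body $tokenBody).access_token
$headers = @{ Authorization = "Bearer $token" }

# Follow @odata.nextLink until the full inventory is retrieved
$detectedApps = @()
$uri = "https://graph.microsoft.com/beta/deviceManagement/detectedApps?`$top=100"
while ($uri) {
    $page = Invoke-RestMethod -Uri $uri -Headers $headers -Method Get
    $detectedApps += $page.value
    $uri = $page.'@odata.nextLink'
}
```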
Building the Sync Flow
The flow runs daily (or on demand) and executes six stages:
1. Scheduled trigger. A daily recurrence at 06:00 UTC. For most organisations, daily is sufficient — application inventory doesn't change hour by hour. If you need tighter feedback, you can supplement the schedule with event-driven triggers when new apps are deployed.
2–3. Fetch managed and detected apps. Two parallel HTTP actions call the Graph API endpoints. Authentication uses an Azure AD app registration with DeviceManagementApps.Read.All permission — read-only, the pipeline never modifies your Intune tenant. Both actions handle pagination by following @odata.nextLink until all data is retrieved.
4. Normalise. Raw Graph API data needs cleaning before classification. Application names contain version numbers ("Google Chrome 121.0.6167.85" vs "Google Chrome"), publishers use inconsistent casing ("Microsoft Corporation" vs "microsoft corporation"), and the same application can appear multiple times across different detection contexts. The normalisation step standardises names, deduplicates entries, and merges the managed and detected datasets into a unified view (a sketch follows this list).
5. AI classify. The normalised dataset is sent to an AI model with a carefully designed system prompt. Any model that supports structured JSON output works — OpenAI, Azure OpenAI, Claude, Gemini, or a local model via Ollama. The AI doesn't just match names — it understands that "Microsoft Visual C++ 2019 Redistributable" is a runtime dependency, that "Notepad++" with 187 devices and no Intune deployment is orphaned, and that "Google Chrome" as a Win32 package is a strong MSIX conversion candidate. More on the classification logic in the next section.
6. Write catalogue. Results sync to a SharePoint list using delta logic: match on a composite key (application name + publisher), update existing entries, create new ones, and flag entries that have disappeared since the last run. The pipeline never deletes — a disappeared app gets its lifecycle status updated to "Under Review" for human decision.
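The normalisation in stage 4 is mostly string hygiene. A minimal sketch, assuming detected-app objects with the displayName, publisher, and version properties Graph returns — the version-stripping regex is illustrative and will need tuning for your estate:

```powershell
function ConvertTo-NormalisedApp {
    param([Parameter(ValueFromPipeline)] $App)
    process {
        [pscustomobject]@{
            # "Google Chrome 121.0.6167.85" → "Google Chrome"
            Name      = ($App.displayName -replace '\s+[\d\.]+$', '').Trim()
            # "microsoft corporation" → "Microsoft Corporation"
            Publisher = (Get-Culture).TextInfo.ToTitleCase("$($App.publisher)".ToLower())
            Version   = $App.version
        }
    }
}

# Deduplicate on the composite key used throughout the pipeline
$normalised = $detectedApps | ConvertTo-NormalisedApp |
    Sort-Object Name, Publisher -Unique
```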
The AI Classification Step
This is the heart of the pipeline — the part that transforms raw inventory data into actionable intelligence.
Five Categories
Each detected application receives exactly one classification:
Managed. The detected app matches a managed Intune deployment. This is the healthy state — you're intentionally deploying this software and it's landing on devices as expected. No action required.
Orphaned. The app is installed across your estate but isn't deployed or managed via Intune. It arrived via legacy deployment tools, user self-install, vendor pre-installation, or some other uncontrolled channel. Every orphaned app is a compliance and security question that needs answering.
Unowned. The app exists in Intune as a managed deployment, but there's no clear owner or responsible team. It's being deployed, but nobody is accountable for patching it, reviewing its business justification, or deciding when to retire it. This is a governance gap that compounds over time.
MSIX Candidate. The app is currently packaged as Win32 and appears suitable for MSIX conversion based on its characteristics: no kernel-mode drivers, no COM registration, no custom install actions, no Windows services. These are the "low-hanging fruit" that Part 3 argued should be your first conversion targets. The AI also assigns an MSIX readiness score from 1 (unlikely) to 5 (trivial).
Retirement. The app appears to be end-of-life, superseded by a newer product, or has a device count near zero suggesting it's no longer needed. These are candidates for a governed retirement process.
The Prompt Matters
The quality of classification depends entirely on the system prompt. A naive "classify these apps" instruction produces mediocre results. The system prompt in the repo encodes domain knowledge that the AI wouldn't otherwise have:
- Framework runtimes (.NET, Visual C++ Redistributable) are dependencies, not standalone applications
- An app detected on significantly more devices than its managed deployment count suggests shadow installations
- Version comparison with vendor lifecycle data identifies end-of-life software
- MSIX readiness assessment requires understanding which installer characteristics are incompatible with containerised execution
Structured output is essential. The prompt requests JSON matching a defined schema, not prose. Every classification includes the category, a confidence reason, the matched managed app (if any), and for MSIX candidates, the readiness score. Machine-readable output means the pipeline can act on results directly — creating SharePoint items, generating reports, populating migration backlogs — without human parsing.
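As an illustration of what that schema produces, a single classification record might look like the following — the field names here are representative rather than the repo's exact schema:

```json
{
  "applicationName": "Notepad++",
  "publisher": "Notepad++ Team",
  "classification": "Orphaned",
  "reason": "Detected on 187 devices with no matching managed deployment in Intune",
  "matchedManagedApp": null,
  "msixReadiness": null
}
```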
Two Implementation Paths
Power Automate + AI. Use the HTTP connector to call your AI provider's API directly from your flow. The system prompt, few-shot examples, and app data go in the request body; structured JSON comes back. For organisations already in the Power Platform ecosystem, this keeps everything in one place.
PowerShell + API. The Invoke-AppClassification.ps1 script in the repo handles batching (splitting large estates into chunks that fit within token limits), retry logic, and structured output parsing. This path gives you more control and integrates naturally with CI/CD pipelines or Azure Automation.
Both paths use the same prompts and produce the same output schema. The repo defaults to OpenAI's chat completions format, but most providers either use the same format natively (Azure OpenAI, Ollama) or offer compatible endpoints (Gemini). For providers with different APIs — like Anthropic's Claude — you'll need to adjust the HTTP call, but the prompts and schema are provider-agnostic. Choose based on what your organisation already has access to.
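As a rough sketch of what that HTTP call looks like on the PowerShell path — against OpenAI's chat completions API, with $systemPrompt and $appBatchJson as placeholders; the repo's Invoke-AppClassification.ps1 layers batching and retries on top of a call like this:

```powershell
$body = @{
    model           = "gpt-4o-mini"              # example model; use whatever your provider offers
    response_format = @{ type = "json_object" }  # request machine-readable output
    messages        = @(
        @{ role = "system"; content = $systemPrompt }
        @{ role = "user";   content = "Classify these applications:`n$appBatchJson" }
    )
} | ConvertTo-Json -Depth 5

$response = Invoke-RestMethod -Method Post `
    -Uri "https://api.openai.com/v1/chat/completions" `
    -Headers @{ Authorization = "Bearer $env:OPENAI_API_KEY" } `
    -ContentType "application/json" -Body $body

# The assistant message body is the JSON classification payload
$classifications = $response.choices[0].message.content | ConvertFrom-Json
```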
The Catalogue as a Living Asset
The SharePoint list serves as the single source of truth. Its schema has two kinds of fields:
Automated fields — populated and updated by the pipeline on every run:
| Field | Source | Description |
|---|---|---|
| Application Name | Graph API | What the app is called |
| Publisher | Graph API | Who makes it |
| Version | Graph API | What's currently installed |
| Classification | AI | Managed, orphaned, unowned, MSIX candidate, or retirement |
| Device Count | Graph API | How many devices have it |
| Packaging Format | Graph API | Win32, MSIX, Store, etc. |
| MSIX Readiness | AI | Conversion difficulty score (1–5) |
| Classification Reason | AI | Why the AI classified it this way |
| Last Sync Date | Pipeline | When this entry was last refreshed |
Manual fields — require human judgement:
| Field | Description |
|---|---|
| Owner | Who's responsible for this application |
| Business Justification | Why the organisation needs it |
| Lifecycle Status | Active, Under Review, Deprecated, Retired |
| Notes | Context that doesn't fit elsewhere |
The pipeline never overwrites manual fields. When it updates an entry, it refreshes the automated columns and leaves human-entered data intact. This is what makes the catalogue maintainable — the tedious part (inventory and classification) is automated, and humans only contribute what only humans can: ownership decisions, business context, and governance judgements.
Delta sync means the pipeline matches incoming data against existing entries using a composite key (application name + publisher). New apps become new rows. Known apps get updated columns. Apps that disappear from the detected inventory have their lifecycle status flagged as "Under Review" rather than being deleted — because "no longer detected" doesn't necessarily mean "should be removed from the catalogue."
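In script form, the matching logic is a hash-table join. A minimal sketch, where $catalogue holds the existing list items and $incoming the freshly classified entries; the Update-/New-/Set- helpers are hypothetical names standing in for whatever SharePoint write calls you use:

```powershell
# Index the existing catalogue on the composite key
$existing = @{}
foreach ($row in $catalogue) {
    $existing["$($row.ApplicationName)|$($row.Publisher)"] = $row
}

foreach ($app in $incoming) {
    $key = "$($app.Name)|$($app.Publisher)"
    if ($existing.ContainsKey($key)) {
        # Known app: refresh automated columns only; manual fields stay untouched
        Update-CatalogueItem -Item $existing[$key] -From $app   # hypothetical helper
        $existing.Remove($key)
    }
    else {
        New-CatalogueItem -From $app                            # hypothetical helper
    }
}

# Anything left in $existing disappeared from the inventory: flag, never delete
foreach ($stale in $existing.Values) {
    Set-CatalogueLifecycle -Item $stale -Status 'Under Review'  # hypothetical helper
}
```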
From Catalogue to Migration Backlog
This is the payoff. The catalogue isn't just documentation — it generates work.
Filter the catalogue view to "MSIX Candidates", sort by readiness score (descending) and device count (descending), and you have a prioritised migration backlog. The apps at the top — high readiness, high device count — deliver the most value for the least effort. These are your first batch.
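The same backlog can be produced in script, if you'd rather feed it into another system than read it from a SharePoint view (property names follow the schema table above):

```powershell
$backlog = $catalogue |
    Where-Object { $_.Classification -eq 'MSIX Candidate' } |
    Sort-Object MsixReadiness, DeviceCount -Descending |
    Select-Object ApplicationName, Publisher, MsixReadiness, DeviceCount
```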
Part 3 argued that MSIX should be the default packaging format, with Win32 reserved for genuine exceptions. The usual objection was: "We don't know which of our Win32 apps can actually be converted." This pipeline answers that question definitively. Not "maybe most of them" — a specific list, with specific readiness scores, backed by AI analysis of each application's characteristics.
A typical enterprise estate with 200+ applications usually surfaces 40–80 MSIX candidates, of which 15–30 score 4 or 5 on the readiness scale (straightforward or trivial conversions). That's a concrete starting point — not a vague "we should look into MSIX someday" initiative, but a specific backlog that a packaging team can work through.
The other outputs are equally actionable:
- Orphaned apps become a security review: who installed this, why, and should it be managed or removed?
- Unowned apps become a governance exercise: assign owners, document justification, or flag for retirement.
- Retirement candidates become a cleanup project: remove from the deployment, reclaim licences, reduce attack surface.
Each of these outputs can be pushed downstream — into Planner tasks, Azure DevOps work items, ServiceNow tickets, or whatever work management system your organisation uses. The pipeline doesn't just tell you what's wrong; it creates the tickets to fix it.
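As one example of that downstream push, creating a Planner task per orphaned app is a single Graph call. A sketch, assuming $plannerPlanId and $plannerBucketId point at a plan you've set up — note the token now needs Planner write permissions on top of the read-only scope used so far:

```powershell
$orphaned = $classifications | Where-Object { $_.classification -eq 'Orphaned' }
foreach ($app in $orphaned) {
    $task = @{
        planId   = $plannerPlanId    # your plan's ID
        bucketId = $plannerBucketId  # e.g. a "Security Review" bucket
        title    = "Review orphaned app: $($app.applicationName)"
    } | ConvertTo-Json
    Invoke-RestMethod -Method Post `
        -Uri "https://graph.microsoft.com/v1.0/planner/tasks" `
        -Headers $headers -ContentType "application/json" -Body $task
}
```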
Running It in Practice
First Run
Expect noise. The first run against a real estate will over-classify, under-classify, and occasionally hallucinate. This is normal. Common issues:
- Framework runtimes classified as orphaned (because they're not in the managed app list as standalone deployments). The few-shot examples in the repo handle the most common ones, but you'll need to add examples specific to your estate.
- Name mismatches where the detected name differs significantly from the managed app name ("Adobe Acrobat Reader DC" vs "Adobe Acrobat Reader" vs "AcroRd32").
- MSIX readiness overestimates for apps that look simple from their name but have complex installation requirements.
Review the first run manually. Correct misclassifications by updating the few-shot examples in prompts/few-shot-examples.json. Each correction improves subsequent runs. After 2–3 iterations, classification accuracy improves substantially and stabilises.
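The shape of those entries isn't reproduced here, but conceptually each correction pairs an input with the classification you expect — something like this hypothetical example for the name-mismatch case above:

```json
{
  "input": {
    "detectedName": "Adobe Acrobat Reader DC",
    "publisher": "Adobe",
    "managedApps": ["Adobe Acrobat Reader"]
  },
  "expected": {
    "classification": "Managed",
    "matchedManagedApp": "Adobe Acrobat Reader",
    "reason": "Same product; the detected name carries the DC branding suffix"
  }
}
```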
Steady State
Once tuned, the pipeline runs daily with minimal attention. What you monitor:
- Weekly digest: new orphaned apps (someone installed something new), classification changes (an app's risk profile shifted), and apps that disappeared from the estate.
- Catalogue coverage: what percentage of detected apps have been classified? This should be at or near 100%.
- Owner coverage: what percentage of managed apps have an assigned owner? This is your governance maturity metric — the pipeline highlights the gap, but closing it requires human action.
- Migration progress: how many MSIX candidates remain? Track this over time as your packaging team works through the backlog.
Where This Goes Next
This post built the data foundation. The catalogue is a living, self-maintaining record of your application estate — accurate, classified, and actionable. But data is only valuable if it drives decisions.
The next post in this series will take two of the catalogue's outputs — orphaned and unowned applications — and build an automated compliance remediation workflow. When an orphaned app appears, the pipeline won't just flag it; it'll determine the risk, draft a remediation plan, and route it to the right team for action.
If you've read Part 4 of the main series, you'll recognise this as one step toward the agentic loop: Monitor → Detect → Plan → Act → Verify. The catalogue is the "Detect" layer — knowing what you have. Compliance remediation is the "Plan and Act" layer — doing something about what you've found. The pieces are starting to connect.
But that's for next time. For now: clone the repo, point it at your tenant, and see what falls out. The first time you run the pipeline and it tells you that 23% of your installed software isn't managed by Intune, you'll understand why this catalogue matters — and why it needs to maintain itself.
This is Part 1 of the Practical AI for Endpoint Management series — hands-on builds that turn the concepts from the Windows Endpoint Management series into working systems.