PrivyDoc: Building a Zero Data Leak AI with Foundry Local & Microsoft Agent Framework

Microsoft Tech Community | December 2025

Motive / Why I Wrote This

In today's data-driven world, organizations face a critical challenge: how to leverage powerful AI capabilities while ensuring complete data privacy. After seeing numerous teams struggle with this dilemma—either avoiding AI adoption entirely due to privacy concerns or risking data exposure through cloud-based solutions—I recognized the need for a comprehensive guide to local-first AI development.

I wrote this article to demonstrate how Microsoft Foundry Local enables developers to build sophisticated AI applications that process sensitive documents entirely on-device, with zero data transmission to external services. The goal was to provide a practical blueprint for implementing privacy-first AI solutions that meet the strictest compliance requirements while still delivering powerful document analysis capabilities.

The motivation stemmed from conversations with legal teams, healthcare organizations, and government agencies who needed AI-powered document analysis but couldn't use cloud services due to regulatory restrictions. By showcasing PrivyDoc as a reference implementation, I aimed to prove that production-quality AI applications can be built with complete data sovereignty.

The Privacy Challenge in AI Adoption

Organizations today face an increasingly complex landscape when it comes to AI adoption. On one hand, AI-powered document analysis can dramatically improve efficiency, accuracy, and insight generation. On the other hand, the sensitive nature of many documents—legal contracts, medical records, financial statements, proprietary research—makes cloud-based processing a non-starter for many organizations.

Regulatory Landscape

The regulatory environment has become increasingly stringent about data handling:

HIPAA requires healthcare organizations to protect patient information with strict controls on data transmission
GDPR mandates that European citizens' data receive special protections, including data residency requirements
SOC 2 compliance demands demonstrable control over where and how data is processed
ITAR/EAR restrictions prevent certain defense-related information from leaving controlled environments
Financial regulations like PCI-DSS and SOX impose strict requirements on handling financial data

For organizations subject to these regulations, the question isn't whether to use AI—it's how to use AI without violating compliance requirements.

The Air-Gap Requirement

In the most security-sensitive environments—government facilities, defense contractors, financial trading floors—systems operate in complete network isolation. These "air-gapped" environments present unique challenges for AI adoption:

No cloud connectivity is possible
All processing must happen on local hardware
Models and dependencies must be pre-installed
Updates and improvements require physical access

PrivyDoc was designed specifically to address these challenges, bringing enterprise-grade AI capabilities to completely isolated environments.

Overview

PrivyDoc represents a new paradigm in AI application development: bringing powerful language models directly to the user's device, ensuring that sensitive documents never leave the local environment. This comprehensive guide walks developers through building a complete document intelligence solution using Microsoft Foundry Local and the Microsoft Agent Framework.

The article begins by establishing the critical importance of data privacy in AI applications. It explores the tension between leveraging AI capabilities and protecting sensitive information, introducing local-first AI as the solution. The discussion covers regulatory requirements like HIPAA, GDPR, and industry-specific compliance needs that prevent many organizations from using cloud-based AI services, positioning on-device processing as the key to unlocking AI adoption in these environments.

Microsoft Foundry Local: The Foundation

Microsoft Foundry Local receives thorough treatment as the foundation for local AI deployment. The article explains how Foundry Local enables running sophisticated language models like Phi-3.5-mini and Qwen 2.5 entirely on consumer hardware, without requiring cloud connectivity after initial model download.

Hardware Requirements and Optimization

Implementation guidance covers model selection based on hardware constraints:

Memory Management: Systems with 8-16GB RAM can effectively run lightweight models with proper optimization
CPU vs GPU: Strategies for maximizing performance on both CPU-only and GPU-accelerated systems
Model Loading: Techniques for efficient model initialization and memory management
Inference Optimization: Batching strategies and caching patterns for improved throughput

Model Selection Guide

The article provides detailed guidance on choosing the right model for different use cases:

Model	RAM Required	Best For	Tradeoffs
qwen2.5-0.5b	4GB	Quick summaries, basic NER	Lower accuracy on complex tasks
phi-3.5-mini	8GB	Balanced performance	Good general-purpose choice
phi-4	16GB+	Complex analysis, nuanced understanding	Higher resource requirements

Multi-Agent Architecture

The multi-agent architecture forms a central element of the implementation. The article details how specialized agents work together to process documents:

Agent Roles and Responsibilities

Text Extraction Agent: Handles PDF and DOCX parsing while preserving document structure, including tables, lists, and formatting hierarchy
Entity Recognition Agent: Identifies people, organizations, locations, dates, and custom entities using NER techniques optimized for local execution
Summarization Agent: Generates concise overviews of entire documents or specific sections, with configurable length and focus areas
Sentiment Analysis Agent: Evaluates emotional tone at both document and section levels, useful for contract analysis and communication review

Agent Communication Patterns

Each agent's design, implementation, and integration patterns are covered with code examples:

Sequential Processing: How agents pass enriched data through the pipeline
Parallel Execution: Running independent analyses simultaneously for improved performance
Result Aggregation: Combining outputs from multiple agents into coherent reports
Error Handling: Graceful degradation when individual agents encounter issues

Document Processing Pipelines

Document processing pipelines receive particular attention, demonstrating how to build robust workflows for handling diverse document formats.

PDF Processing

The article covers PDF text extraction using pdfplumber:

Text Layer Extraction: Handling both native and scanned PDFs
Layout Preservation: Maintaining column structures and reading order
Table Recognition: Extracting structured data from tables
Metadata Handling: Preserving document properties and annotations

DOCX Processing

DOCX processing with python-docx includes:

Paragraph Extraction: Clean text extraction with style information
Header/Footer Handling: Proper treatment of repeated elements
Embedded Objects: Handling images, charts, and other embedded content
Track Changes: Processing documents with revision history

Structure Recognition

Advanced techniques for identifying sections and hierarchy:

Heading Detection: Automatic identification of document structure
Section Boundaries: Intelligent grouping of related content
Cross-Reference Handling: Maintaining links between document sections
Normalization: Text cleaning that preserves semantic meaning

Chainlit Web Interface

The Chainlit web interface implementation shows how to create an intuitive user experience for document analysis. Coverage includes:

User Interface Components

File Upload: Drag-and-drop and button-based file upload with validation
Progress Tracking: Real-time feedback during analysis with stage indicators
Result Exploration: Interactive navigation of analysis results
Export Options: One-click export to Markdown, JSON, or CSV formats

Accessibility Considerations

The discussion addresses user experience considerations that make complex AI capabilities accessible to non-technical users:

Clear Visual Hierarchy: Organizing results for easy scanning
Contextual Help: In-app guidance for interpreting results
Error Messages: User-friendly explanations when issues occur
Responsive Design: Consistent experience across devices

Security and Compliance Features

Security and compliance features round out the implementation, covering the techniques that guarantee zero data transmission.

Network Isolation Verification

The article explains network isolation verification:

Connection Monitoring: Verifying no outbound connections occur during processing
Firewall Integration: Working within corporate security infrastructure
Audit Logging: Comprehensive records of all processing activities

Compliance-Ready Patterns

Additional security features include:

Document Fingerprinting: Verify document integrity and processing history
Access Controls: Role-based permissions for sensitive analyses
Data Retention: Configurable policies for result storage and cleanup
Audit Trails: Complete logging for regulatory compliance

Air-Gap Deployment

Air-gap deployment scenarios receive special attention for the most security-sensitive environments:

Offline Installation: Complete setup without network access
Model Pre-loading: Strategies for deploying models to isolated systems
Update Procedures: Secure methods for applying updates and improvements

Implementation Highlights

Code Architecture

The PrivyDoc implementation follows best practices for maintainable, extensible code:

Modular Design: Each component can be updated or replaced independently
Configuration-Driven: Behavior controlled through settings rather than code changes
Comprehensive Logging: Detailed logging for debugging and audit purposes
Error Recovery: Graceful handling of edge cases and failures

Performance Optimization

Key optimizations implemented in PrivyDoc:

Lazy Loading: Models loaded only when needed
Result Caching: Avoiding redundant processing
Batch Processing: Efficient handling of multiple documents
Memory Management: Aggressive cleanup to support resource-constrained environments

Frameworks & Tools Covered

Microsoft Foundry Local
Microsoft Agent Framework
Phi-3.5-mini and Qwen 2.5 language models
Python 3.10+ development
Chainlit web framework
pdfplumber for PDF extraction
python-docx for DOCX processing
Named Entity Recognition (NER)
Sentiment analysis techniques
JSON-based local storage
Multi-agent orchestration patterns
Air-gap deployment strategies

Learning Outcomes

Understand local-first AI architecture patterns using Microsoft Foundry Local
Build multi-agent document processing pipelines with specialized analysis agents
Implement secure document analysis workflows with guaranteed data privacy
Create intuitive web interfaces for AI applications using Chainlit
Design air-gap compatible systems for offline operation
Optimize lightweight language models for resource-constrained environments
Develop comprehensive entity recognition and sentiment analysis capabilities
Apply compliance-ready patterns for regulated industries

Impact / Results

This article provides developers with a complete blueprint for building privacy-first AI applications. The PrivyDoc implementation demonstrates that sophisticated document analysis can be achieved entirely on-device, opening AI adoption opportunities for organizations that previously couldn't use cloud-based services due to privacy requirements.

The multi-agent architecture patterns have broad applicability beyond document analysis, showing how specialized agents can collaborate to handle complex AI tasks. The Foundry Local integration techniques provide a foundation for building any local-first AI application that requires complete data sovereignty.

Community Engagement

Published on Microsoft Tech Community, this article contributes to the growing body of knowledge around privacy-preserving AI development and local-first application architecture.

Category: Project Articles

Related Articles:

Related Project: PrivyDoc Project

Read Full Article

Read on Microsoft Tech Community