PrivyDoc: Building a Zero Data Leak AI with Foundry Local & Microsoft Agent Framework
Microsoft Tech Community | December 2025
Motive / Why I Wrote This
In today's data-driven world, organizations face a critical challenge: how to leverage powerful AI capabilities while ensuring complete data privacy. After seeing numerous teams struggle with this dilemma—either avoiding AI adoption entirely due to privacy concerns or risking data exposure through cloud-based solutions—I recognized the need for a comprehensive guide to local-first AI development.
I wrote this article to demonstrate how Microsoft Foundry Local enables developers to build sophisticated AI applications that process sensitive documents entirely on-device, with zero data transmission to external services. The goal was to provide a practical blueprint for implementing privacy-first AI solutions that meet the strictest compliance requirements while still delivering powerful document analysis capabilities.
The motivation stemmed from conversations with legal teams, healthcare organizations, and government agencies who needed AI-powered document analysis but couldn't use cloud services due to regulatory restrictions. By showcasing PrivyDoc as a reference implementation, I aimed to prove that production-quality AI applications can be built with complete data sovereignty.
The Privacy Challenge in AI Adoption
Organizations today face an increasingly complex landscape when it comes to AI adoption. On one hand, AI-powered document analysis can dramatically improve efficiency, accuracy, and insight generation. On the other hand, the sensitive nature of many documents—legal contracts, medical records, financial statements, proprietary research—makes cloud-based processing a non-starter for many organizations.
Regulatory Landscape
The regulatory environment has become increasingly stringent about data handling:
- HIPAA requires healthcare organizations to protect patient information with strict controls on data transmission
- GDPR mandates that European citizens' data receive special protections, including data residency requirements
- SOC 2 compliance demands demonstrable control over where and how data is processed
- ITAR/EAR restrictions prevent certain defense-related information from leaving controlled environments
- Financial regulations like PCI-DSS and SOX impose strict requirements on handling financial data
For organizations subject to these regulations, the question isn't whether to use AI—it's how to use AI without violating compliance requirements.
The Air-Gap Requirement
In the most security-sensitive environments—government facilities, defense contractors, financial trading floors—systems operate in complete network isolation. These "air-gapped" environments present unique challenges for AI adoption:
- No cloud connectivity is possible
- All processing must happen on local hardware
- Models and dependencies must be pre-installed
- Updates and improvements require physical access
PrivyDoc was designed specifically to address these challenges, bringing enterprise-grade AI capabilities to completely isolated environments.
Overview
PrivyDoc represents a new paradigm in AI application development: bringing powerful language models directly to the user's device, ensuring that sensitive documents never leave the local environment. This comprehensive guide walks developers through building a complete document intelligence solution using Microsoft Foundry Local and the Microsoft Agent Framework.
The article begins by establishing the critical importance of data privacy in AI applications. It explores the tension between leveraging AI capabilities and protecting sensitive information, introducing local-first AI as the solution. The discussion covers regulatory requirements like HIPAA, GDPR, and industry-specific compliance needs that prevent many organizations from using cloud-based AI services, positioning on-device processing as the key to unlocking AI adoption in these environments.
Microsoft Foundry Local: The Foundation
Microsoft Foundry Local receives thorough treatment as the foundation for local AI deployment. The article explains how Foundry Local enables running sophisticated language models like Phi-3.5-mini and Qwen 2.5 entirely on consumer hardware, without requiring cloud connectivity after initial model download.
Hardware Requirements and Optimization
Implementation guidance covers model selection based on hardware constraints:
- Memory Management: Systems with 8-16GB RAM can effectively run lightweight models with proper optimization
- CPU vs GPU: Strategies for maximizing performance on both CPU-only and GPU-accelerated systems
- Model Loading: Techniques for efficient model initialization and memory management
- Inference Optimization: Batching strategies and caching patterns for improved throughput
Model Selection Guide
The article provides detailed guidance on choosing the right model for different use cases:
| Model | RAM Required | Best For | Tradeoffs |
|---|---|---|---|
| qwen2.5-0.5b | 4GB | Quick summaries, basic NER | Lower accuracy on complex tasks |
| phi-3.5-mini | 8GB | Balanced performance | Good general-purpose choice |
| phi-4 | 16GB+ | Complex analysis, nuanced understanding | Higher resource requirements |
Multi-Agent Architecture
The multi-agent architecture forms a central element of the implementation. The article details how specialized agents work together to process documents:
Agent Roles and Responsibilities
-
Text Extraction Agent: Handles PDF and DOCX parsing while preserving document structure, including tables, lists, and formatting hierarchy
-
Entity Recognition Agent: Identifies people, organizations, locations, dates, and custom entities using NER techniques optimized for local execution
-
Summarization Agent: Generates concise overviews of entire documents or specific sections, with configurable length and focus areas
-
Sentiment Analysis Agent: Evaluates emotional tone at both document and section levels, useful for contract analysis and communication review
Agent Communication Patterns
Each agent's design, implementation, and integration patterns are covered with code examples:
- Sequential Processing: How agents pass enriched data through the pipeline
- Parallel Execution: Running independent analyses simultaneously for improved performance
- Result Aggregation: Combining outputs from multiple agents into coherent reports
- Error Handling: Graceful degradation when individual agents encounter issues
Document Processing Pipelines
Document processing pipelines receive particular attention, demonstrating how to build robust workflows for handling diverse document formats.
PDF Processing
The article covers PDF text extraction using pdfplumber:
- Text Layer Extraction: Handling both native and scanned PDFs
- Layout Preservation: Maintaining column structures and reading order
- Table Recognition: Extracting structured data from tables
- Metadata Handling: Preserving document properties and annotations
DOCX Processing
DOCX processing with python-docx includes:
- Paragraph Extraction: Clean text extraction with style information
- Header/Footer Handling: Proper treatment of repeated elements
- Embedded Objects: Handling images, charts, and other embedded content
- Track Changes: Processing documents with revision history
Structure Recognition
Advanced techniques for identifying sections and hierarchy:
- Heading Detection: Automatic identification of document structure
- Section Boundaries: Intelligent grouping of related content
- Cross-Reference Handling: Maintaining links between document sections
- Normalization: Text cleaning that preserves semantic meaning
Chainlit Web Interface
The Chainlit web interface implementation shows how to create an intuitive user experience for document analysis. Coverage includes:
User Interface Components
- File Upload: Drag-and-drop and button-based file upload with validation
- Progress Tracking: Real-time feedback during analysis with stage indicators
- Result Exploration: Interactive navigation of analysis results
- Export Options: One-click export to Markdown, JSON, or CSV formats
Accessibility Considerations
The discussion addresses user experience considerations that make complex AI capabilities accessible to non-technical users:
- Clear Visual Hierarchy: Organizing results for easy scanning
- Contextual Help: In-app guidance for interpreting results
- Error Messages: User-friendly explanations when issues occur
- Responsive Design: Consistent experience across devices
Security and Compliance Features
Security and compliance features round out the implementation, covering the techniques that guarantee zero data transmission.
Network Isolation Verification
The article explains network isolation verification:
- Connection Monitoring: Verifying no outbound connections occur during processing
- Firewall Integration: Working within corporate security infrastructure
- Audit Logging: Comprehensive records of all processing activities
Compliance-Ready Patterns
Additional security features include:
- Document Fingerprinting: Verify document integrity and processing history
- Access Controls: Role-based permissions for sensitive analyses
- Data Retention: Configurable policies for result storage and cleanup
- Audit Trails: Complete logging for regulatory compliance
Air-Gap Deployment
Air-gap deployment scenarios receive special attention for the most security-sensitive environments:
- Offline Installation: Complete setup without network access
- Model Pre-loading: Strategies for deploying models to isolated systems
- Update Procedures: Secure methods for applying updates and improvements
Implementation Highlights
Code Architecture
The PrivyDoc implementation follows best practices for maintainable, extensible code:
- Modular Design: Each component can be updated or replaced independently
- Configuration-Driven: Behavior controlled through settings rather than code changes
- Comprehensive Logging: Detailed logging for debugging and audit purposes
- Error Recovery: Graceful handling of edge cases and failures
Performance Optimization
Key optimizations implemented in PrivyDoc:
- Lazy Loading: Models loaded only when needed
- Result Caching: Avoiding redundant processing
- Batch Processing: Efficient handling of multiple documents
- Memory Management: Aggressive cleanup to support resource-constrained environments
Frameworks & Tools Covered
- Microsoft Foundry Local
- Microsoft Agent Framework
- Phi-3.5-mini and Qwen 2.5 language models
- Python 3.10+ development
- Chainlit web framework
- pdfplumber for PDF extraction
- python-docx for DOCX processing
- Named Entity Recognition (NER)
- Sentiment analysis techniques
- JSON-based local storage
- Multi-agent orchestration patterns
- Air-gap deployment strategies
Learning Outcomes
- Understand local-first AI architecture patterns using Microsoft Foundry Local
- Build multi-agent document processing pipelines with specialized analysis agents
- Implement secure document analysis workflows with guaranteed data privacy
- Create intuitive web interfaces for AI applications using Chainlit
- Design air-gap compatible systems for offline operation
- Optimize lightweight language models for resource-constrained environments
- Develop comprehensive entity recognition and sentiment analysis capabilities
- Apply compliance-ready patterns for regulated industries
Impact / Results
This article provides developers with a complete blueprint for building privacy-first AI applications. The PrivyDoc implementation demonstrates that sophisticated document analysis can be achieved entirely on-device, opening AI adoption opportunities for organizations that previously couldn't use cloud-based services due to privacy requirements.
The multi-agent architecture patterns have broad applicability beyond document analysis, showing how specialized agents can collaborate to handle complex AI tasks. The Foundry Local integration techniques provide a foundation for building any local-first AI application that requires complete data sovereignty.
Community Engagement
Published on Microsoft Tech Community, this article contributes to the growing body of knowledge around privacy-preserving AI development and local-first application architecture.
Article Navigation
Category: Project Articles
Related Articles:
Related Project: PrivyDoc Project