FormPilot — Web Form Automation Framework
2025 | Python, Playwright MCP, Azure OpenAI, Agno | GitHub
FormPilot is a local automation framework that reads structured markdown files and fills out web forms automatically using Playwright MCP and Azure OpenAI. It is built for developers and content creators who need to automate repetitive form submissions while keeping control over their data.
Problem / Motivation
Manual form submission is tedious and error-prone, especially when dealing with:
- Repetitive data entry across multiple platforms or submissions
- Bulk content management requiring consistent formatting and validation
- Time-consuming workflows where form filling disrupts productivity
- Data inconsistencies from manual entry mistakes
- Limited automation tools that require complex configuration or cloud dependencies
FormPilot addresses these challenges with a markdown-driven, AI-powered automation tool that runs on your local machine. Form data stays on-device apart from the Azure OpenAI API calls used for matching and content generation.
Core Functionalities
Markdown-Driven Data Management
- Define form data in simple, structured markdown files with clear field definitions
- Maintain version control over form data using standard markdown syntax
- Enable easy collaboration through readable, editable text files
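As an illustration of the markdown-driven approach, here is a minimal parsing sketch. The schema is an assumption, not FormPilot's documented format: each entry starts with a `## ` heading and lists its fields as `- Key: value` bullets. The names `parse_entries`, `missing_fields`, and `SAMPLE` are hypothetical.

```python
import re

# Assumed bullet syntax: "- Key: value" or "- **Key**: value".
FIELD_RE = re.compile(r"^-\s*\*{0,2}(?P<key>[^:*]+?)\*{0,2}\s*:\s*(?P<value>.*)$")


def parse_entries(markdown_text):
    """Return one dict per '## ' heading, with its bullet fields lower-cased by key."""
    entries = []
    current = None
    for raw in markdown_text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = {"title": line[3:].strip(), "fields": {}}
            entries.append(current)
        elif current is not None:
            match = FIELD_RE.match(line)
            if match:
                key = match.group("key").strip().lower()
                current["fields"][key] = match.group("value").strip()
    return entries


def missing_fields(entry, required):
    """List the required field names that are absent or empty in an entry."""
    return [name for name in required if not entry["fields"].get(name)]


SAMPLE = """\
## Intro to Playwright talk
- Date: 2025-01-15
- **Category**: Speaking
- URL: https://example.com/talk

## Blog post on MCP
- Date: 2025-02-01
"""
```

Calling `parse_entries(SAMPLE)` yields two entries, and `missing_fields` can flag the second one as lacking a category before any browser work starts.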
AI-Powered Form Automation
- Uses Azure OpenAI (GPT-4/GPT-4o) for form field matching and content generation
- Applies fuzzy matching to map content to form dropdown options and fields
- Auto-generates missing descriptions when fields are incomplete
- Rewrites content automatically to meet character limits and formatting requirements
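The fuzzy-matching idea can be sketched without an LLM. The example below uses Python's standard `difflib` as a stand-in for the Azure OpenAI matching described above, so it shows the general technique rather than FormPilot's actual implementation. The function name `match_option` and the `cutoff` default are assumptions.

```python
from difflib import get_close_matches


def match_option(value, options, cutoff=0.6):
    """Return the dropdown option closest to value, or None if nothing clears the cutoff."""
    # Compare case-insensitively, but return the option's original label.
    normalized = {option.strip().lower(): option for option in options}
    key = value.strip().lower()
    if key in normalized:
        return normalized[key]
    hits = get_close_matches(key, list(normalized), n=1, cutoff=cutoff)
    return normalized[hits[0]] if hits else None
```

Returning `None` instead of a best guess lets the caller fall back to a default option or ask for review when no label is close enough.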
Playwright MCP Integration
- Leverages Model Context Protocol for robust browser automation
- Implements snapshot-based error recovery for resilient form filling
- Provides interactive review mode for verification before submission
- Handles complex form interactions including dropdowns, date pickers, and multi-step workflows
Smart Processing Features
- Batch processing with configurable batch sizes for efficiency
- Intelligent caching of dropdown options and generated content
- Adaptive retry logic with exponential backoff for rate limiting
- Sequential or batched modes to optimize for speed or control
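Retry with exponential backoff and fixed-size batching can be sketched as follows. This is a generic illustration of the two patterns under assumed names (`retry_with_backoff`, `chunked`) and assumed default delays, not FormPilot's own code.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call func until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so parallel callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


def chunked(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

In practice `retry_on` would be narrowed to the rate-limit and timeout exceptions of whichever client is in use, so that genuine bugs fail fast instead of being retried.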
Description / How It Works
- Data Preparation: Users create structured markdown files with activity/form data following a defined schema
- Parsing & Validation: Azure OpenAI parses markdown, validates required fields, and generates missing content
- Browser Automation: Playwright MCP initializes browser automation and navigates to target forms
- Intelligent Form Filling: AI-powered agent matches data to form fields using fuzzy matching and smart field selection
- Error Handling: Automatic retry logic with snapshot-based recovery handles failures gracefully
- Submission: Optional confirmation step before final submission, with detailed logging
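The fill, confirm, and submit steps above can be sketched as a small orchestration loop. The browser and LLM work is passed in as plain callables, so every name here (`run_pipeline`, `Result`, `fill_form`, `submit`, `confirm`) is hypothetical and the real Playwright MCP and Azure OpenAI calls are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    status: str  # "submitted", "skipped", or "failed"
    error: str = ""


def run_pipeline(entries, fill_form, submit, confirm=None):
    """Fill each entry's form, optionally ask for confirmation, then submit."""
    results = []
    for entry in entries:
        try:
            fill_form(entry)
            # Interactive review: skip submission when the user declines.
            if confirm is not None and not confirm(entry):
                results.append(Result(entry["title"], "skipped"))
                continue
            submit(entry)
            results.append(Result(entry["title"], "submitted"))
        except Exception as exc:
            # One failing entry is recorded and does not stop the rest of the run.
            results.append(Result(entry["title"], "failed", str(exc)))
    return results
```

Collecting a `Result` per entry gives the run a simple status log that can be printed or written to a file once it finishes.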
Challenges & Issues Addressed
- Data Privacy: Processing happens locally; the only data sent externally goes to the Azure OpenAI API
- Form Field Matching: Fuzzy matching and fallback options handle varying dropdown labels and field formats
- Content Constraints: Automatic content rewriting ensures compliance with character limits
- Network Reliability: Adaptive retry logic with exponential backoff handles rate limits and timeouts
- Browser Automation Stability: Snapshot-based error recovery and retries on element detection keep runs stable
- Batch Processing: Configurable batch sizes and processing modes optimize throughput
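For the content-constraint case, FormPilot rewrites text with the LLM. As a purely illustrative complement, the sketch below shows a deterministic fallback that trims text to a limit at a word boundary. The name `fit_to_limit` and the `"..."` suffix are assumptions and nothing here is taken from the project.

```python
def fit_to_limit(text, limit, ellipsis="..."):
    """Collapse whitespace and trim text so the result is at most limit characters."""
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    if limit <= len(ellipsis):
        return text[:limit]  # no room for an ellipsis, so hard-cut
    cut = text[: limit - len(ellipsis)]
    # Drop the last (possibly partial) word so the cut falls on a space.
    head, sep, _ = cut.rpartition(" ")
    if sep:
        cut = head
    return cut.rstrip(" ,;:.") + ellipsis
```

A check like this can also run after an LLM rewrite, since a model asked to stay under a limit may still overshoot it.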
Tech Stack & Frameworks
- Languages / Frameworks: Python 3.11+, Node.js (for Playwright MCP)
- AI / ML: Azure OpenAI (GPT-4/GPT-4o), Agno agent framework
- Browser Automation: Playwright MCP (Model Context Protocol)
- Data Format: Markdown with structured field definitions
- Environment: Local execution with environment variable configuration
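Environment-variable configuration might look like the sketch below. The variable names are assumptions: `AZURE_OPENAI_ENDPOINT` and `AZURE_OPENAI_API_KEY` follow common Azure OpenAI conventions, while `AZURE_OPENAI_DEPLOYMENT` and `FORMPILOT_BATCH_SIZE` are hypothetical and may not match what the project actually reads.

```python
import os
from dataclasses import dataclass

# Assumed variable names; check the project's own documentation for the real ones.
REQUIRED = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY", "AZURE_OPENAI_DEPLOYMENT")


@dataclass(frozen=True)
class Settings:
    endpoint: str
    api_key: str
    deployment: str
    batch_size: int = 5


def load_settings(env=None):
    """Build Settings from environment variables, failing early if any are missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        endpoint=env["AZURE_OPENAI_ENDPOINT"],
        api_key=env["AZURE_OPENAI_API_KEY"],
        deployment=env["AZURE_OPENAI_DEPLOYMENT"],
        batch_size=int(env.get("FORMPILOT_BATCH_SIZE", "5")),
    )
```

Validating all required variables at startup reports every missing name at once, rather than failing midway through a batch.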
Features / Capabilities
- Local Processing: Runs on-device; only Azure OpenAI API calls leave the machine
- Markdown-Based Data: Version-controllable, human-readable form definitions
- Smart Technology Mapping: Analyzes content to select appropriate technology categories
- Auto-Generated Content: Creates internal notes and descriptions when missing
- Character Limit Compliance: Automatically rewrites content to fit form constraints
- Batch & Sequential Modes: Process multiple entries efficiently
- Interactive Review: Optional confirmation before each submission
- Fast Mode: Optimized performance for bulk operations
- Error Recovery: Adaptive retry logic with snapshot-based recovery
- Progress Tracking: Detailed timing and status information
Potential Applications
- Content Management: Automate submissions to blogging platforms, community sites, and portfolios
- Research Documentation: Bulk entry of research activities and publications
- Portfolio Management: Batch updates to professional profiles (LinkedIn, GitHub, personal sites)
- Event Registration: Automated workshop, conference, and webinar submissions
- Activity Tracking: Streamline logging of community contributions and speaking engagements
Future Enhancements
- Expand support for additional form types (file uploads, rich text editors)
- Add multi-browser support (Chrome, Firefox, Edge)
- Implement form template recognition for automatic field detection
- Add support for YAML and JSON data formats alongside Markdown
- Develop browser extension for one-click form automation
- Integrate with CI/CD pipelines for automated form testing
Learning Outcomes
- Built a local-first automation framework prioritizing data privacy and user control
- Integrated Playwright MCP for robust, snapshot-driven browser automation
- Implemented AI-powered fuzzy matching for intelligent form field selection
- Developed adaptive error handling with exponential backoff and retry logic
- Combined structured markdown parsing with LLM content generation
- Learned batch processing patterns for optimizing automation workflows
Links
- GitHub Repository: FormPilot
Note
FormPilot is built for educational and productivity purposes. Users are responsible for ensuring compliance with target website terms of service, respecting rate limits, and using the tool ethically and responsibly.