"What's my accounts receivable right now?" It's a simple question, but for most small business owners, answering it requires logging into QuickBooks, navigating to the right report, setting the date range, and interpreting the output. That's a 2-minute process at minimum — assuming you know where to look. We built a voice assistant that answers that question in 3 seconds. You speak, it queries QuickBooks, and it speaks back: "Your current accounts receivable is $24,350 across 12 outstanding invoices. The oldest is 47 days overdue from Acme Corp for $3,200."
The Problem: Accounting Software Wasn't Built for Business Owners
QuickBooks is powerful software. It's also complex software designed primarily for bookkeepers and accountants. The business owners we work with — people running venues, managing properties, operating restaurants — use QuickBooks because they have to, not because they want to. They need answers to straightforward questions: How much money came in this week? Did invoice #1042 get paid? What did we spend on supplies this month? How much do we owe vendors? But getting those answers requires navigating a UI designed for accounting professionals.
We saw an opportunity to put a natural language layer on top of QuickBooks — an interface where the user doesn't need to know which report to run or which menu to navigate. They just ask their question in plain English and get an answer.
The NLP Pipeline
The system's intelligence lives in its natural language processing pipeline, which takes a spoken or typed query and translates it into a structured QuickBooks API call. The pipeline has four stages.
Stage 1: Speech-to-Text. Voice input is captured via the device microphone and sent to a speech recognition service for transcription. We use Whisper for its accuracy with business terminology, numbers, and proper nouns (company names, vendor names). The raw transcript is cleaned — filler words removed, numbers normalized, and company names matched against the client's QuickBooks contact list using fuzzy matching.
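The cleanup and name-matching step can be sketched with Python's standard library. This is a minimal illustration, not the production code: the filler-word list, the function names, and the 0.6 similarity cutoff are assumptions, and number normalization is left out.

```python
import difflib
from typing import List, Optional

# Hypothetical filler list; a real one would be tuned on actual transcripts.
FILLER_WORDS = {"um", "uh", "er", "hmm"}


def clean_transcript(text: str) -> str:
    """Drop filler words and collapse extra whitespace."""
    kept = [w for w in text.split() if w.lower().strip(",.") not in FILLER_WORDS]
    return " ".join(kept)


def match_contact(spoken: str, contacts: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the contact closest to the spoken name, or None if none is similar enough."""
    by_lower = {c.lower(): c for c in contacts}
    hits = difflib.get_close_matches(spoken.lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[hits[0]] if hits else None
```

With a contact list of "Acme Corp" and "Globex LLC", a misheard "akme corp" still resolves to "Acme Corp", while an unrelated string returns None so the assistant can ask the user to repeat the name.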
Stage 2: Intent Classification. The cleaned transcript is analyzed to determine what the user wants to do. We defined 14 intent categories: check balance, view receivables, view payables, send invoice, check invoice status, record payment, view expenses by category, view profit/loss, check cash flow, list overdue invoices, create estimate, record expense, search transactions, and compare periods. A fine-tuned classification model assigns each query to one of these intents and picks the correct one 94% of the time in production.
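To make the intent set concrete, here is one way the 14 categories could be represented. Only view_expenses_by_category and send_invoice appear as identifiers in this post; the other names are assumed snake_case versions of the listed categories. The keyword lookup is a deliberately naive stand-in used to show the input and output shape, not the fine-tuned model.

```python
from enum import Enum
from typing import Optional


class Intent(Enum):
    CHECK_BALANCE = "check_balance"
    VIEW_RECEIVABLES = "view_receivables"
    VIEW_PAYABLES = "view_payables"
    SEND_INVOICE = "send_invoice"
    CHECK_INVOICE_STATUS = "check_invoice_status"
    RECORD_PAYMENT = "record_payment"
    VIEW_EXPENSES_BY_CATEGORY = "view_expenses_by_category"
    VIEW_PROFIT_LOSS = "view_profit_loss"
    CHECK_CASH_FLOW = "check_cash_flow"
    LIST_OVERDUE_INVOICES = "list_overdue_invoices"
    CREATE_ESTIMATE = "create_estimate"
    RECORD_EXPENSE = "record_expense"
    SEARCH_TRANSACTIONS = "search_transactions"
    COMPARE_PERIODS = "compare_periods"


# Hypothetical keyword rules, checked in order. A trained classifier replaces this.
KEYWORD_RULES = [
    ("overdue", Intent.LIST_OVERDUE_INVOICES),
    ("receivable", Intent.VIEW_RECEIVABLES),
    ("owe", Intent.VIEW_PAYABLES),
    ("spend", Intent.VIEW_EXPENSES_BY_CATEGORY),
    ("balance", Intent.CHECK_BALANCE),
]


def classify(query: str) -> Optional[Intent]:
    """Return the first intent whose keyword appears in the query, else None."""
    lowered = query.lower()
    for keyword, intent in KEYWORD_RULES:
        if keyword in lowered:
            return intent
    return None
```

Returning None for unmatched queries leaves room for a fallback such as asking the user to rephrase.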
Stage 3: Entity Extraction. Once we know the intent, we extract the relevant parameters. "How much did we spend on supplies this month?" has intent view_expenses_by_category with entities category: supplies and period: current_month. "Send an invoice to Acme Corp for $5,000" has intent send_invoice with entities customer: Acme Corp and amount: 5000. The entity extractor handles relative dates ("last month," "this quarter," "past 30 days"), fuzzy company name matching, and implicit defaults (if no date range is specified, assume current month).
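Relative date handling is the easiest part of entity extraction to show in code. The sketch below resolves the phrases mentioned above into a concrete date range and applies the current-month default when no period is given. The function name and the choice of inclusive (start, end) ranges are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def resolve_period(phrase: Optional[str], today: date) -> Tuple[date, date]:
    """Map a relative date phrase to an inclusive (start, end) range."""
    # Implicit default: no phrase means the current month.
    phrase = (phrase or "this month").lower().strip()
    if phrase == "this month":
        return today.replace(day=1), today
    if phrase == "last month":
        # Last day of the previous month is the day before this month's first day.
        end = today.replace(day=1) - timedelta(days=1)
        return end.replace(day=1), end
    if phrase == "this quarter":
        first_month = 3 * ((today.month - 1) // 3) + 1
        return date(today.year, first_month, 1), today
    if phrase == "past 30 days":
        return today - timedelta(days=30), today
    raise ValueError(f"unrecognized period: {phrase!r}")
```

Passing today in as an argument, rather than calling date.today() inside the function, keeps the logic easy to test against fixed dates such as leap-year boundaries.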
Stage 4: API Translation and Response. The classified intent and extracted entities are translated into QuickBooks API calls, and a response generation layer formats the returned data into natural language, turning raw numbers into conversational summaries. A list of 12 overdue invoices becomes "You have 12 overdue invoices totaling $18,400. The three largest are..." The response is delivered as both text and synthesized speech.
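The response generation step can be illustrated with a small formatter. It assumes the invoice data has already been fetched and flattened into dictionaries with hypothetical "customer" and "balance" keys. It does not call QuickBooks, and the field names are not the API's own.

```python
from typing import Dict, List


def summarize_overdue(invoices: List[Dict], top_n: int = 3) -> str:
    """Turn a list of overdue invoices into a short spoken-style summary."""
    if not invoices:
        return "You have no overdue invoices."
    total = sum(inv["balance"] for inv in invoices)
    # Largest balances first, so the summary leads with what matters most.
    largest = sorted(invoices, key=lambda inv: inv["balance"], reverse=True)[:top_n]
    details = ", ".join(f"{inv['customer']} (${inv['balance']:,.0f})" for inv in largest)
    noun = "invoice" if len(invoices) == 1 else "invoices"
    return (
        f"You have {len(invoices)} overdue {noun} totaling ${total:,.0f}. "
        f"The largest: {details}."
    )
```

The same string can be shown on screen and passed to the text-to-speech engine, so the spoken and written answers never diverge.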
Safety Rails for Financial Operations
Read operations (checking balances, viewing reports) execute immediately. Write operations (sending invoices, recording payments) require explicit confirmation. When a user says "Send an invoice to Acme Corp for $5,000," the system responds: "I'll create an invoice to Acme Corp for $5,000. Should I send it now?" Only a "yes" or "confirm" response triggers the action. This prevents costly mistakes from misheard commands.
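A confirmation gate of this kind might look like the following sketch. The class and method names are hypothetical, and the write-intent set lists only the two operations named above; other intents such as creating estimates would likely belong there too.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Assumed set of write intents; reads are everything else.
WRITE_INTENTS = {"send_invoice", "record_payment"}
CONFIRM_WORDS = {"yes", "confirm"}


@dataclass
class PendingAction:
    intent: str
    entities: Dict


class ConfirmationGate:
    def __init__(self) -> None:
        self.pending: Optional[PendingAction] = None

    def handle(self, intent: str, entities: Dict) -> Tuple[str, PendingAction]:
        """Return ("execute", action) for reads and ("confirm", action) for writes."""
        action = PendingAction(intent, entities)
        if intent in WRITE_INTENTS:
            self.pending = action  # hold the write until the user confirms
            return "confirm", action
        return "execute", action

    def reply(self, utterance: str) -> Optional[PendingAction]:
        """Release the pending write only on an explicit yes or confirm."""
        action, self.pending = self.pending, None  # any reply clears the pending slot
        if action is not None and utterance.strip().lower().rstrip(".!") in CONFIRM_WORDS:
            return action
        return None
```

Clearing the pending action on any reply means a stray "yes" later in the session cannot trigger an invoice the user has already moved past.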
We also built role-based access: business owners get full access, managers can view reports but cannot create invoices above a set amount, and employees can only check their own time entries. Every interaction is logged with the user, timestamp, query, and action taken, which gives us an audit trail.
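The access rules and audit record can be sketched as below. The threshold value, the role strings, the view_own_time_entries intent name, and the decision to deny managers all other write operations are assumptions for illustration; the post does not specify them.

```python
from datetime import datetime, timezone
from typing import Dict

MANAGER_INVOICE_LIMIT = 1000  # hypothetical threshold in dollars


def is_allowed(role: str, intent: str, amount: float = 0.0) -> bool:
    """Decide whether a role may run an intent, given an optional dollar amount."""
    if role == "owner":
        return True
    if role == "manager":
        if intent == "send_invoice":
            return amount <= MANAGER_INVOICE_LIMIT
        # Assumption: read-style intents share these prefixes; other writes are denied.
        return intent.startswith(("view_", "check_", "list_"))
    if role == "employee":
        return intent == "view_own_time_entries"
    return False  # unknown roles get nothing


def audit_entry(user: str, query: str, action: str) -> Dict[str, str]:
    """Build one audit record with the user, a UTC timestamp, the query, and the action."""
    return {
        "user": user,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "query": query,
        "action": action,
    }
```

Denying by default for unknown roles and unlisted intents keeps a misconfigured account from gaining access by accident.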
Real-World Usage Patterns
After deploying the system to three clients, clear usage patterns emerged. The most common queries are balance checks (28%), receivables status (22%), expense lookups (18%), and invoice status checks (15%). Peak usage is early morning (owners checking yesterday's numbers over coffee) and late evening (end-of-day review). The average session is 3-4 queries, suggesting users are doing quick check-ins rather than deep accounting work — exactly the use case we designed for.
One client told us: "I used to avoid looking at my books because QuickBooks felt overwhelming. Now I check my numbers three times a day because it's as easy as asking a question." That's the outcome we were after — not replacing QuickBooks, but making it accessible to the people who own the data.