Behind the Build: ConnectOnion Mail Agent – Voice, Intelligence & Relationship Tracking

Building on top of an existing framework teaches you things that building from scratch never will.

I extended ConnectOnion's email agent reference implementation with four unique features: Contact Intelligence, Voice Email, Relationship Tracking, and Weekly Analytics.


The Context

ConnectOnion provides an agent framework for building AI assistants. Their reference implementation handles basic Gmail operations – reading inbox, sending emails, managing labels.

I wanted to demonstrate framework extension skills for an application demo. Rather than clone and modify, I built features that showcased what the framework enables when you understand its patterns.


The Four Features

1. Contact Intelligence

Before emailing someone important, you should know who they are. The research_contact() function fetches their company website and uses AI to generate talking points.

class ContactIntel(BaseModel):
    company: str
    role_guess: Optional[str] = None
    industry: str
    talking_points: List[str]
    tone_suggestion: str
    recent_news: Optional[str] = None

The flow:

  1. Extract domain from email address
  2. Skip personal domains (gmail.com, yahoo.com, etc.)
  3. Fetch company website using ConnectOnion's WebFetch
  4. Extract social links
  5. Use llm_do() with Pydantic model for structured output
  6. Store in memory for future reference

This took ~50 lines. The framework's WebFetch and llm_do() did the heavy lifting.
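
Steps 1, 2, and 4 of the flow are plain string handling, so they can be sketched without the framework. The version below is a minimal illustration, not the project's code: the names corporate_domain and social_links, the contents of PERSONAL_DOMAINS beyond gmail.com and yahoo.com, and the set of social networks matched are all assumptions.

```python
import re
from typing import List, Optional

# Assumed list; the post only names gmail.com and yahoo.com explicitly.
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com", "icloud.com"}

# Hypothetical pattern covering a few common networks.
SOCIAL_LINK = re.compile(
    r"https?://(?:www\.)?(?:linkedin\.com|twitter\.com|x\.com|github\.com)/[^\s\"'<>]+",
    re.IGNORECASE,
)


def corporate_domain(email: str) -> Optional[str]:
    """Return the lowercased domain, or None for malformed or personal addresses."""
    local, sep, domain = email.strip().rpartition("@")
    domain = domain.lower()
    if not sep or not local or "." not in domain:
        return None
    if domain in PERSONAL_DOMAINS:
        return None
    return domain


def social_links(html: str) -> List[str]:
    """Collect unique social profile URLs in first-seen order."""
    return list(dict.fromkeys(SOCIAL_LINK.findall(html)))
```

Returning None for personal domains lets the caller skip the fetch with a single check before any network call is made.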


2. Voice Email

Dictate emails on the go. Record a voice memo, and the agent transcribes it and drafts the email.

def voice_to_email(audio_file: str, recipient_hint: str = "") -> str:
    # 1. Validate audio format
    valid_formats = {".wav", ".mp3", ".aiff", ".aac", ".ogg", ".flac", ".m4a"}
    
    # 2. Transcribe using Gemini
    transcript = transcribe(
        audio=str(audio_path),
        prompt="Email dictation. The speaker is dictating an email to send.",
    )
    
    # 3. Extract intent (recipient, subject, tone)
    intent = llm_do(intent_prompt, output=EmailIntent, temperature=0.2)
    
    # 4. Research recipient if corporate domain
    if recipient and "@" in recipient:
        domain = recipient.split("@")[1]
        if domain not in personal_domains:
            contact_context = research_contact(recipient)
    
    # 5. Draft email using all context
    draft = llm_do(draft_prompt, output=EmailDraft, temperature=0.4)

The key insight: voice emails benefit from contact intelligence. If you're dictating an email to someone at a company, the agent automatically researches them before drafting.

I hit one gotcha: OGG/Opus voice memos from iOS caused 500 errors with Gemini. My workaround was to document MP3 as the recommended format, although .ogg is still in the accepted list above.
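
One way to surface that gotcha earlier is to check the file extension before transcription and attach a warning for formats known to be flaky. This is a sketch under assumptions: check_audio_format and RISKY_FORMATS are hypothetical names, and the excerpt above does not show how the real validation step reports errors.

```python
from pathlib import Path
from typing import List, Tuple

# Same extensions as the valid_formats set in voice_to_email.
VALID_FORMATS = {".wav", ".mp3", ".aiff", ".aac", ".ogg", ".flac", ".m4a"}
# Assumed: formats accepted on paper but reported as unreliable in practice.
RISKY_FORMATS = {".ogg"}


def check_audio_format(audio_file: str) -> Tuple[str, List[str]]:
    """Return the normalized suffix plus warnings; raise ValueError if unsupported."""
    suffix = Path(audio_file).suffix.lower()
    if suffix not in VALID_FORMATS:
        raise ValueError(
            f"Unsupported audio format {suffix!r}; expected one of {sorted(VALID_FORMATS)}"
        )
    warnings: List[str] = []
    if suffix in RISKY_FORMATS:
        warnings.append("OGG/Opus may fail during transcription; MP3 is recommended.")
    return suffix, warnings
```

A warning rather than a hard failure keeps OGG files usable when they do happen to work, while still pointing the user at MP3.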


3. Relationship Tracking

Relationships decay if you don't maintain them. The do_relationships() function analyzes when you last contacted people and flags those needing attention.

def do_relationships() -> str:
    # Extract contacts from memory
    for line in str(memory_content).split("\n"):
        if "contact:" in line.lower():
            ...  # Parse email and last contact date into contacts (elided)
            
    # Categorize by engagement health
    for email, last_contact in contacts.items():
        days_ago = (now - last_contact).days
        if days_ago > 14:
            critical.append((email, days_ago))  # 🔴
        elif days_ago > 7:
            warning.append((email, days_ago))   # 🟡
        else:
            healthy.append((email, days_ago))   # 🟢

The thresholds are opinionated: more than 14 days is critical, 8 to 14 days is a warning, and 7 days or fewer is healthy. These work for active networking; you'd adjust them for different contexts.
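
Since the thresholds are meant to be adjusted, it may help to lift them into parameters. The sketch below keeps the same strict greater-than comparisons as the excerpt; the function name categorize_contacts, the parameter names, and the dictionary return shape are assumptions for illustration.

```python
from datetime import datetime
from typing import Dict, List, Tuple


def categorize_contacts(
    contacts: Dict[str, datetime],
    now: datetime,
    warning_after: int = 7,
    critical_after: int = 14,
) -> Dict[str, List[Tuple[str, int]]]:
    """Bucket contacts by whole days since last contact."""
    buckets: Dict[str, List[Tuple[str, int]]] = {
        "critical": [],
        "warning": [],
        "healthy": [],
    }
    for email, last_contact in contacts.items():
        days_ago = (now - last_contact).days
        if days_ago > critical_after:
            buckets["critical"].append((email, days_ago))
        elif days_ago > warning_after:
            buckets["warning"].append((email, days_ago))
        else:
            buckets["healthy"].append((email, days_ago))
    # Most neglected contacts first within each bucket.
    for entries in buckets.values():
        entries.sort(key=lambda item: item[1], reverse=True)
    return buckets
```

For a slower-moving network, a call such as categorize_contacts(contacts, now, warning_after=30, critical_after=90) would reuse the same logic with looser limits.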


4. Weekly Analytics

do_weekly() aggregates your email activity for the past seven days and generates a short AI recommendation.

def do_weekly() -> str:
    week_ago = (datetime.now() - timedelta(days=7)).strftime("%Y/%m/%d")
    
    received = gmail.search_emails(query=f"after:{week_ago} -in:sent", max_results=100)
    sent = gmail.search_emails(query=f"after:{week_ago} in:sent", max_results=100)
    unread = gmail.search_emails(query="is:unread", max_results=50)
    
    # Count emails from output (lines containing @ or starting with digit)
    
    recommendation = llm_do(
        f"Email productivity analysis: {stats}. Provide a brief actionable recommendation.",
        model="co/gemini-2.5-flash",
    )

The recommendation comes from Gemini Flash – fast and cheap for a one-liner insight.
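
The query construction and the counting heuristic mentioned in the comment can be isolated into two small helpers, which makes them testable without touching Gmail. This is a sketch: weekly_queries and count_email_lines are hypothetical names, and the assumption that search results arrive as one text line per email comes only from the comment in the excerpt.

```python
from datetime import datetime, timedelta
from typing import Dict


def weekly_queries(now: datetime, days: int = 7) -> Dict[str, str]:
    """Build the three Gmail search strings used by the weekly report."""
    since = (now - timedelta(days=days)).strftime("%Y/%m/%d")
    return {
        "received": f"after:{since} -in:sent",
        "sent": f"after:{since} in:sent",
        "unread": "is:unread",
    }


def count_email_lines(output: str) -> int:
    """Count non-empty lines that contain '@' or start with a digit."""
    return sum(
        1
        for line in (raw.strip() for raw in output.splitlines())
        if line and ("@" in line or line[0].isdigit())
    )
```

The heuristic is fragile by design: a header such as "3 results found" would be counted as an email, so a structured return value would be safer if the search tool offers one.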


The Plugin Architecture

ConnectOnion supports plugins via decorators. I built three:

Approval Workflow

Never send an email without confirmation:

from connectonion import before_each_tool

def require_send_approval(agent):
    pending = agent.current_session.get("pending_tool", {})
    tool_name = pending.get("name", "")
    
    if tool_name in ["send_email", "reply_to_email", ...]:
        # Show preview panel
        console.print(Panel(f"**To:** {to}\n**Subject:** {subject}\n\n{body_preview}"))
        
        # Ask for confirmation
        confirmed = Confirm.ask("Send this email?", default=False)
        if not confirmed:
            raise RuntimeError("Email cancelled by user")

approval_workflow = [before_each_tool(require_send_approval)]

The before_each_tool decorator intercepts every tool call. I check if it's a send operation and require confirmation.
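
The gating decision itself does not need the framework or a terminal, so it can be pulled into a plain function that takes the confirmation step as a callable. This is a sketch under assumptions: gate_send is a hypothetical name, SEND_TOOLS lists only the two tool names visible in the excerpt, and the "arguments" key on the pending tool is a guess at where the recipient and subject live.

```python
from typing import Any, Callable, Dict

# Only the two names shown in the excerpt; the original list is partly elided.
SEND_TOOLS = {"send_email", "reply_to_email"}


def gate_send(pending: Dict[str, Any], confirm: Callable[[str], bool]) -> None:
    """Raise RuntimeError if a send-type tool call is not confirmed."""
    if pending.get("name", "") not in SEND_TOOLS:
        return  # Non-send tools pass through untouched.
    args = pending.get("arguments", {})  # Assumed key for the tool's arguments.
    preview = f"To: {args.get('to', '?')}\nSubject: {args.get('subject', '?')}"
    if not confirm(preview):
        raise RuntimeError("Email cancelled by user")
```

Injecting confirm means the hook can pass an interactive prompt in production, while tests pass a lambda that returns True or False.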

Email Insights

Add AI analysis after reading emails:

from connectonion import after_tools

class EmailInsight(BaseModel):
    priority_level: str  # urgent, high, normal, low
    action_needed: bool
    key_topics: List[str]
    sentiment: str
    suggested_action: Optional[str] = None

def add_email_insights(agent):
    last_result = agent.current_session.get("last_result", "")
    
    insight = llm_do(
        f"Analyze this email...",
        output=EmailInsight,
        model="co/gemini-2.5-flash",
    )
    
    # Display with emoji indicators
    console.print(f"{priority_emoji} Priority: {insight.priority_level.upper()}")

email_insights_plugin = [after_tools(add_email_insights)]

The after_tools decorator runs after tool execution. I analyze the result if it looks like email content.
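
The post does not show how "looks like email content" is decided. One plausible heuristic, offered purely as an assumption, is to require at least two distinct header-style lines before spending an LLM call on analysis. The name looks_like_email and the header list are hypothetical.

```python
import re

# Assumed header names; real tool output may be formatted differently.
HEADER = re.compile(r"^(from|to|subject|date):\s*\S", re.IGNORECASE | re.MULTILINE)


def looks_like_email(text: str, min_headers: int = 2) -> bool:
    """True when the text has at least min_headers distinct header lines."""
    found = {match.group(1).lower() for match in HEADER.finditer(text)}
    return len(found) >= min_headers
```

Requiring two distinct headers avoids triggering on output that merely contains a line such as "To: do list".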

Agent Visibility

Show workflow summaries after tasks complete:

from connectonion import on_complete

def show_workflow_summary(agent):
    session = agent.current_session
    tool_calls = session.get("tool_call_count", 0)
    delegations = session.get("delegation_count", 0)
    
    console.print(Panel(" • ".join(summary), title="📊 Workflow Summary"))

agent_visibility_plugin = [on_complete(show_workflow_summary)]

This gives users transparency into what the agent did: how many tools it called, how many delegations it made, and how long the task took.
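
The excerpt joins a summary list that is built elsewhere. A minimal sketch of that step might look like the following, where build_summary is a hypothetical name and "elapsed_seconds" is an assumed session key that the excerpt does not show.

```python
from typing import Any, Dict, List


def build_summary(session: Dict[str, Any]) -> List[str]:
    """Turn session counters into short human-readable fragments."""
    tool_calls = session.get("tool_call_count", 0)
    delegations = session.get("delegation_count", 0)
    parts = [f"{tool_calls} tool call{'' if tool_calls == 1 else 's'}"]
    if delegations:
        parts.append(f"{delegations} delegation{'' if delegations == 1 else 's'}")
    elapsed = session.get("elapsed_seconds")  # Assumed key, not shown in the post.
    if elapsed is not None:
        parts.append(f"{elapsed:.1f}s")
    return parts
```

The returned list can be passed straight to " • ".join(...) as in the panel call above.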


The CLI/TUI Architecture

The agent runs in two modes:

CLI Mode (Typer):

email-agent inbox --count 10 --unread
email-agent research alice@acme.com
email-agent voice memo.mp3 --to bob@company.com
email-agent relationships
email-agent weekly

Interactive TUI Mode (ConnectOnion's Chat):

chat = Chat(
    agent=agent,
    triggers={
        "/": COMMANDS,  # Slash commands
        "@": contacts,  # Contact autocomplete
    },
    hints=["/ commands", "@ contacts", "Enter send", "Ctrl+D quit"],
)

chat.command("/research", lambda text: do_research(text[9:].strip()))
chat.command("/voice", _voice)
chat.command("/relationships", lambda _: do_relationships())

The TUI provides autocomplete for commands and contacts. Type / to see all commands, @ to mention contacts.
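
The /research handler slices the input with text[9:], which works because "/research" is nine characters long but has to be recounted for every command. A small helper, sketched here with the hypothetical name command_arg, derives the argument from the command name instead.

```python
from typing import Optional


def command_arg(text: str, command: str) -> Optional[str]:
    """Return the argument after a slash command, or None if text is another command."""
    head, _, rest = text.strip().partition(" ")
    if head != command:
        return None
    return rest.strip()
```

With this helper the handler could read lambda text: do_research(command_arg(text, "/research") or ""), and a lookalike such as "/researcher" would no longer be sliced as if it were "/research".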


The System Prompt

The agent's behavior is defined in prompts/agent.md (311 lines). Key principle:

NEVER ask questions before using tools. ALWAYS use tools first to gather information, then propose complete solutions.

The prompt explicitly forbids reactive behavior:

**NEVER say:**
- "What time works for you?"
- "What should the meeting be about?"
- "What do you want to say?"

**ALWAYS propose complete solutions:**
For meetings:
"I checked your calendar - you're free tomorrow 9-11am. Based on your 
emails with X about [topic], I suggest '[Topic] Sync' tomorrow at 9am. Book it?"

This makes the agent proactive. Instead of asking what to do, it gathers context and proposes actions.


What I Learned About Framework Extension

  1. Read the framework's patterns first. ConnectOnion uses Pydantic for structured output, decorators for plugins, session state for tool context. Once you understand these patterns, extension is straightforward.

  2. Leverage what exists. WebFetch, Memory, llm_do(), transcribe() – I didn't reimplement any of these. The unique value was in the composition.

  3. Plugins are the extension point. The three hooks (before_each_tool, after_tools, on_complete) covered every use case I had. Good frameworks make extension obvious.

  4. Test at the boundary. 35 tests cover my code, not the framework's. I test that my functions return expected formats, not that Gmail sends emails.
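
A boundary test in this spirit replaces the Gmail tool with a stand-in and asserts only on the format of the function's output. Everything in the sketch below is hypothetical: FakeGmail, unread_report, and the assumption that search_emails returns plain text with one email per line.

```python
from typing import List


class FakeGmail:
    """Stand-in that records queries instead of calling the Gmail API."""

    def __init__(self, canned: str) -> None:
        self.canned = canned
        self.queries: List[str] = []

    def search_emails(self, query: str, max_results: int = 10) -> str:
        self.queries.append(query)
        return self.canned


def unread_report(gmail) -> str:
    """Hypothetical function under test: formats an unread count."""
    output = gmail.search_emails(query="is:unread", max_results=50)
    count = sum(1 for line in output.splitlines() if "@" in line)
    return f"Unread: {count}"


def test_unread_report_format() -> None:
    gmail = FakeGmail("1. alice@acme.com - Hello\n2. bob@corp.io - Re: Plan")
    assert unread_report(gmail) == "Unread: 2"
    assert gmail.queries == ["is:unread"]
```

The test checks what the code asked for and what it returned, and never whether Gmail itself behaves correctly.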


Stats

Metric             Value
Python files       17
Tests              35
Custom plugins     3
Unique features    4
CLI commands       16

The Takeaway

Building from scratch proves you can build. Extending a framework proves you can collaborate with existing code – which is what most professional work looks like.

The ConnectOnion framework handled OAuth, Gmail API, Calendar API, LLM orchestration, and TUI rendering. I focused on the composition layer – combining these primitives into features that solve real problems.

That's the skill that matters in production: knowing when to build and when to extend.

