Six things you'll never
have to figure out yourself.

Every component of a Pro tier deployment is handled by PrivateAI. You receive a working, secure, on-premise AI system — not a parts list.

Step 01
Hardware Selection
Jim recommends the right GPU desktop for your practice size and workload. No guesswork — a specific build that matches your budget and compliance requirements.
Step 02
Model Configuration
Ollama installed and configured with the right Llama model for your hardware. System prompt tuned for your practice type — not a generic assistant.
Step 03
Secure Network Setup
Caddy reverse proxy configured for your local network. Handles SSL, routing, and ensures the AI is only accessible through controlled channels.
Step 04
Remote Access Tunnel
Cloudflare tunnel configured so you can reach your office AI from your phone or laptop, anywhere — with encrypted traffic that even Cloudflare cannot read.
Step 05
Training & Go-Live
A hands-on walkthrough of your system — how to use it, how to prompt it effectively for your workflows, and what to do if something needs attention.
Step 06
Ongoing Support
Monthly subscription includes model updates, security patches, and direct access to Jim. Not a ticket queue — a person who built your system.

Two hardware profiles.
Both fully managed by Jim.

These are reference configurations — Jim will recommend the right build for your specific practice during the consultation. Pricing is estimated; actual hardware costs vary.

Solo Practice
Standard Node
Hardware est. ~$1,100
"Full AI capability for one to three concurrent users. Runs quietly in a closet or under a desk."
GPU RTX 3060 12GB
RAM 32GB DDR5
Storage 1TB NVMe SSD
Models Llama 3.1 8B · 14B
Power ~120W typical load
Form Factor Mini / SFF desktop
Capability
Handles clinical notes, legal drafting, and document review comfortably. Response times of 2–5 seconds for typical queries. Suitable for solo practitioners and small teams.
Regulated Office
Performance Node
Hardware est. ~$1,900
"Larger models, faster responses, multi-user workloads. Built for practices where the AI is used all day."
GPU RTX 4070 Ti Super 16GB
RAM 64GB DDR5 6000MHz
Storage 2TB NVMe SSD
Models Llama 3.1 8B · 32B · 70B
Power ~200W typical load
Form Factor Mid-tower desktop
Capability
Runs 32B models with sub-3-second responses. Handles 3–5 simultaneous users. Recommended for multi-clinician practices, small law firms, or high-volume document workflows.

What runs on your hardware.

Every component is open source, battle-tested, and configured by Jim. No subscriptions to third-party AI services. No data leaving the building.

Ollama
Local model runtime. Manages Llama and other open-source models on your GPU.
Caddy
Reverse proxy and SSL termination. Controls access and handles encrypted routing.
Cloudflare Tunnel
Secure remote access. Encrypted tunnel from your devices back to your office node.
Llama 3.1
Meta's open-source language model. Runs entirely on your hardware — no API calls, no cloud.
Why this stack? Every component has been production-tested across Jim's existing deployments. Ollama handles model management with minimal configuration. Caddy automates SSL with zero friction. Cloudflare tunnel provides remote access without opening firewall ports — a meaningful security advantage for a practice office running 24/7.

Secure Remote Access

Pro tier isn't limited to your office desk. The Cloudflare tunnel gives you encrypted access to your on-premise AI from any device, anywhere — courthouse, hotel, or home.

  • Access from iPhone, Android, laptop, or tablet
  • Encrypted tunnel — traffic is unreadable in transit
  • No open firewall ports on your office network
  • Works on public Wi-Fi and 5G without added risk
  • Your data still never leaves your office hardware
  • Jim monitors tunnel health as part of your subscription
Day 1
Consultation Call
20 minutes with Jim. Understand your practice, your workflows, and your hardware situation. Hardware recommendation delivered same day.
Days 2–7
Hardware Procurement
You order the recommended hardware (or Jim ships a pre-configured unit). Typical delivery 3–5 business days depending on location.
Days 7–10
Remote Configuration
Jim connects remotely and installs the full stack — Ollama, model, Caddy, tunnel. Typically 2–4 hours of work on his end.
Day 10–12
Training & Go-Live
A 60-minute walkthrough of your system. You're live and using it for real workflows by end of week two in most cases.

What practices ask before
they commit to Pro tier.

Do I need any technical knowledge to use this?
No. The system is configured to work like any other chat interface — you type, it responds. Jim handles everything technical. If something needs attention, you contact Jim and he resolves it remotely. The only thing you need to do is turn the computer on.
What happens if the hardware fails?
Jim monitors your node remotely and will often detect issues before you notice them. For hardware failures, Jim will guide you through the recovery process and help you get back online quickly. Standard desktop hardware is widely available and replaceable — there are no proprietary components that create long-term lock-in.
Can the AI be customized for my specific practice?
Yes — and this is one of the significant advantages of Pro tier. The system prompt is fully customized during setup. A medical practice gets an AI tuned to clinical workflows. A law firm gets one tuned to legal drafting conventions. This isn't a generic assistant — it's configured for how you actually work.
Is this actually HIPAA compliant?
On-premise deployment means patient data is processed exclusively on hardware you control, in your facility. There is no third-party AI processor involved — which eliminates the BAA requirement that makes cloud AI problematic for HIPAA-covered entities. That said, compliance involves more than your AI tool — PrivateAI strongly recommends consulting with a healthcare compliance attorney for your specific situation. We handle the technical architecture; your attorney confirms the compliance framework.
How much electricity does this use?
The Standard Node draws approximately 80–120W under typical AI load — comparable to a gaming console. At average US electricity rates running 24/7, this adds roughly $7–$12/month to your electricity bill. The Performance Node draws somewhat more, but both are far less than the power consumption of most office copiers.
What if I want to cancel?
There's no long-term contract — monthly subscriptions cancel at any time. The hardware is yours. The software stack (Ollama, Caddy, Cloudflare tunnel) is all open source. If you cancel, your system continues to run — you simply take over its management. Jim can provide documentation for self-management on request.

Ready to see what's right
for your practice?

A free 20-minute consultation with Jim. He'll tell you exactly what hardware you need, what it will cost, and how long deployment takes — before you commit to anything.