A dedicated desktop with a GPU is the most capable and cost-effective way to run a private AI node. Here's what to know before you buy — and how it compares to cloud-based alternatives like Cowork and Perplexity Personal Computer.
Local AI inference is not like running a spreadsheet. The model weights — billions of numbers that define how the AI thinks — must be loaded into memory and read repeatedly on every token generated. The single biggest factor in how fast and capable your private AI feels is how much memory your machine has, and how fast that memory is.
There are two types of memory that matter: GPU VRAM (video memory on a discrete graphics card) and system RAM (your computer's main memory). GPU VRAM is dramatically faster for inference. When a model fits entirely in VRAM, token generation speeds of 30–80 tokens per second are achievable. When it spills into system RAM, speed drops significantly. The practical rule:
Buy the most VRAM you can afford. Everything else is secondary.
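To see why VRAM capacity is the gating factor, here's a back-of-envelope sketch of whether a quantized model fits on a given card. The bits-per-weight and overhead figures are rough assumptions (typical 4-bit quantization plus KV cache and runtime buffers), not exact requirements:

```python
# Rough estimate of whether a quantized model fits in VRAM.
# Rule of thumb (assumption): bytes ~= params * bits_per_weight / 8,
# plus ~20% overhead for KV cache and runtime buffers.

def model_vram_gb(params_billions: float, bits_per_weight: float = 4.5,
                  overhead: float = 1.2) -> float:
    """Approximate VRAM needed to serve a quantized model, in GB."""
    weight_gb = params_billions * bits_per_weight / 8
    return weight_gb * overhead

for params, card_vram in [(8, 12), (32, 24), (70, 24)]:
    need = model_vram_gb(params)
    fits = "fits" if need <= card_vram else "spills to system RAM"
    print(f"{params}B model: ~{need:.0f} GB needed vs {card_vram} GB VRAM -> {fits}")
```

By this estimate an 8B model fits comfortably on a 12GB card, a 32B model just fits in 24GB, and a 70B model spills into system RAM even on a 24GB card, which is exactly where generation speed falls off a cliff.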
Memory bandwidth (how fast data moves between the chip and memory) is what translates into tokens per second. Here's roughly how the main memory types compare:

- DDR4 system RAM (dual-channel): ~50 GB/s
- DDR5 system RAM (dual-channel): ~80–96 GB/s
- Apple M4 Pro unified memory: 273 GB/s
- GDDR6 VRAM (mid-range GPUs): ~336–480 GB/s
- GDDR6X VRAM (RTX 3090/4090): ~936–1,008 GB/s
This is why a GPU desktop running a model entirely in VRAM is faster than a Mac mini running the same model in unified memory — and why DDR5 matters significantly more than DDR4 when the model spills out of VRAM into system RAM.
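The relationship is simple enough to sketch: every generated token requires streaming the model's active weights through the chip once, so bandwidth divided by model size gives a hard ceiling on tokens per second. Real throughput is lower, but the ceiling explains the gaps between tiers (the ~18 GB figure below assumes a 32B model at 4-bit quantization):

```python
# Bandwidth-bound upper limit on token generation:
#   tokens/s <= memory_bandwidth / model_size_in_bytes
# This is a ceiling, not a prediction; real speeds run below it.

def max_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    return bandwidth_gb_s / model_gb

model_gb = 18  # assumed: ~32B model at 4-bit quantization
for name, bw in [("DDR4 dual-channel", 51), ("DDR5 dual-channel", 90),
                 ("M4 Pro unified", 273), ("GDDR6X (RTX 3090/4090)", 1008)]:
    print(f"{name:>24}: <= {max_tokens_per_sec(bw, model_gb):.0f} t/s ceiling")
```

The ceilings this produces (~15 t/s for 273 GB/s unified memory, ~56 t/s for GDDR6X) line up with the measured ranges in the comparison table below.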
All three builds below run Ollama for local model serving, connect via Cloudflare tunnel for encrypted remote access, and are reachable from any smartphone browser. Prices reflect current 2026 component costs, which are elevated due to AI-driven DDR4/DDR5 and M.2 storage demand increases.
Perplexity uses the Mac mini M4 Pro as the reference hardware for their Personal Computer product. It's a legitimate machine — Apple Silicon's unified memory architecture is efficient and capable. But the GPU desktop tells a different story on cost, capability ceiling, and upgradeability.
| Factor | Mac mini M4 Pro (64GB) | GPU Desktop Tier 2 | GPU Desktop Tier 3 |
|---|---|---|---|
| Price (2026) | $2,399 (Apple MSRP) | $1,600–1,900 | $2,400–2,900 |
| Memory bandwidth | 273 GB/s unified | 336–480 GB/s VRAM | 1,008 GB/s VRAM |
| Usable memory for LLMs | ~58GB (unified) | 16–20GB VRAM + 64GB RAM | 24GB VRAM + 64–128GB RAM |
| 32B model speed | 11–15 t/s | 22–38 t/s | 38–55 t/s |
| Power under load | ~40W | ~350W | ~500–600W |
| Upgradeable | No — soldered RAM | Yes — swap GPU anytime | Yes — swap GPU anytime |
| OS flexibility | macOS only | Linux / Windows | Linux / Windows |
| Best for | Silent, power-efficient always-on node | Small office, 3–8 users | Larger team, research, best model quality |
The Mac mini wins on power efficiency and silence — it draws 40 watts versus 350–600W for a GPU desktop. For a home office where noise and electricity matter, that's real. For a dedicated office environment where performance and cost per capability matter more, the GPU desktop wins at every tier. And critically, the GPU desktop is upgradeable — when better GPUs arrive, you replace the card, not the whole machine.
DDR4 32GB kits that cost $60–90 in late 2025 now run $150–180. DDR5 32GB kits under $360 are scarce. Budget accordingly — this is the biggest cost surprise in current builds compared to guides written a year ago.
For LLM inference, DDR5 delivers nearly twice the memory bandwidth of DDR4 when the model spills out of VRAM. The price premium is worth it. If you're on a tight budget, DDR4 works — but DDR5 is the right choice for a machine you'll run for several years.
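The "nearly twice" figure falls straight out of the bandwidth arithmetic. Assuming common dual-channel configurations (DDR4-3200 vs DDR5-6000, which are typical but not tied to any specific build above):

```python
# Theoretical dual-channel bandwidth: channels * 64-bit bus * transfer rate.
# DDR4-3200 and DDR5-6000 are assumed as representative kits.

def dual_channel_gb_s(mt_per_s: int, channels: int = 2,
                      bus_bits: int = 64) -> float:
    return channels * (bus_bits / 8) * mt_per_s / 1000  # GB/s

ddr4 = dual_channel_gb_s(3200)  # 51.2 GB/s
ddr5 = dual_channel_gb_s(6000)  # 96.0 GB/s
print(f"DDR4-3200: {ddr4:.1f} GB/s, DDR5-6000: {ddr5:.1f} GB/s ({ddr5/ddr4:.2f}x)")
```

That works out to 51.2 GB/s versus 96.0 GB/s, a 1.9x gap, which is what you feel the moment a model spills out of VRAM.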
The RTX 3090's 24GB GDDR6X VRAM matches the RTX 4090 in capacity, at roughly half the price on the used market. Speed is lower but the model headroom is identical. For Tier 2 budgets wanting Tier 3 VRAM, a used 3090 is worth considering.
A 70B model takes 30 seconds to load from a fast NVMe SSD versus 3–5 minutes from a spinning hard drive. You only load once per session, but for a node that may restart or switch models, fast storage is noticeable.
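Load time is just file size over sustained read speed. A 70B model at 4-bit quantization is roughly a 40 GB file (an estimate), and the drive speeds below are typical sustained figures, not benchmarks of any specific drive:

```python
# Model load time ~= file size / sustained read speed.
# File size and drive speeds are rough assumptions.

def load_seconds(file_gb: float, read_gb_s: float) -> float:
    return file_gb / read_gb_s

file_gb = 40  # assumed: ~70B model, 4-bit quantization
for drive, speed in [("NVMe SSD (~1.5 GB/s sustained)", 1.5),
                     ("SATA SSD (~0.5 GB/s)", 0.5),
                     ("HDD (~0.15 GB/s)", 0.15)]:
    print(f"{drive}: ~{load_seconds(file_gb, speed):.0f} s")
```

That gives roughly 27 seconds on NVMe versus about 4.5 minutes on a spinning disk, matching the ranges above.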
The GPU desktop's biggest advantage over the Mac mini is that when better GPUs arrive — RTX 5090, future RDNA generations — you swap one card. The rest of the build stays. The Mac mini requires replacing the entire machine.
A private AI node works best running continuously — always reachable from your phone. Factor in power costs: a Tier 2 build at 350W under load costs roughly $180–250/year in electricity at average US rates, assuming moderate daily use.
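To sanity-check that annual figure, here's the arithmetic with a simple duty-cycle assumption: the node runs at full load only during active use and idles the rest of the day. The idle draw, hours of use, and electricity rate are assumptions you should swap for your own:

```python
# Annual electricity cost for an always-on node.
# Idle wattage, active hours, and $/kWh below are assumptions.

def annual_cost(load_w: float, idle_w: float, active_h_per_day: float,
                rate_per_kwh: float = 0.17) -> float:
    daily_kwh = (load_w * active_h_per_day +
                 idle_w * (24 - active_h_per_day)) / 1000
    return daily_kwh * 365 * rate_per_kwh

# Tier 2: ~350W under load, assumed ~60W idle, 6 active hours/day
print(f"Tier 2 estimate: ${annual_cost(350, 60, 6):.0f}/year")
```

Under those assumptions the bill lands around $200/year, squarely inside the $180–250 range; heavier use or higher local rates push it toward the top end.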
You don't have to build from parts. Retailers like Walmart, Amazon, and Best Buy regularly stock prebuilt gaming desktops with current-generation GPUs and DDR5 RAM already installed — often competitive once you factor in the included OS license, cooler, and case. Search for iBUYPOWER, CyberPowerPC, or ASUS ROG prebuilts as starting points. Check local stores too — local pickup avoids shipping damage risk on a full tower.
The PrivateAI demo node is running on a 32GB desktop — similar hardware to Tier 1. When the node is online, you can connect from any browser, on any device, from anywhere. No app, no account, no cloud relay. Your prompts go to a machine in a private office, not to Anthropic or Perplexity's servers.
→ Try PrivateAI Chat ← Back to Overview