# AIP-16 — Local Inference Profile
## Abstract

This AIP defines the normative standard for local-first AI execution within Gao AI OS.

It establishes:

- device classification
- model class abstraction
- privacy tiers
- offline guarantees
- routing constraints
- policy integrity requirements
This AIP is vendor-neutral.
## Design Principle

Gao AI OS is model-agnostic.

Local inference MUST NOT depend on:

- a specific vendor
- a specific architecture
- a specific training pipeline
Runtime MUST operate on model classes, not model brands.
## Model Class Abstraction

Implementations MUST map concrete models into the following classes.

| Model Class | Indicative Size | Intended Use |
|---|---|---|
| M-16.0 Nano | ~0.5B–2B | Phone-level low latency |
| M-16.1 Small | ~2B–4B | Writing, extraction |
| M-16.2 Medium | ~4B–9B | Planning, RAG |
| M-16.3 Heavy | 9B+ | Advanced reasoning |
Runtime MUST route by class, not vendor name.
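A minimal sketch of class-based routing, assuming the runtime knows only a model's parameter count (the thresholds mirror the indicative sizes above and are illustrative, not normative):

```python
def model_class(params_billion: float) -> str:
    """Map a concrete model's parameter count to an AIP-16 model class.

    Routing on the returned class keeps the runtime vendor-neutral:
    no vendor or brand name ever enters the decision.
    """
    if params_billion < 2:
        return "M-16.0"  # Nano: phone-level low latency
    if params_billion < 4:
        return "M-16.1"  # Small: writing, extraction
    if params_billion < 9:
        return "M-16.2"  # Medium: planning, RAG
    return "M-16.3"      # Heavy: advanced reasoning
```

Any concrete model catalog would be reduced to these four classes before the router sees it.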
## Device Classes

| Device Class | Description |
|---|---|
| D-16.A | Phone |
| D-16.B | Tablet |
| D-16.C | Laptop |
| D-16.D | Workstation |
Routing MUST consider device constraints.
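One way to honor device constraints is to cap the requested model class at a per-device ceiling. The ceilings below are hypothetical assumptions for illustration; AIP-16 does not mandate them:

```python
# Illustrative ceilings: the largest model class each device class can host.
DEVICE_CEILING = {
    "D-16.A": "M-16.0",  # Phone
    "D-16.B": "M-16.1",  # Tablet
    "D-16.C": "M-16.2",  # Laptop
    "D-16.D": "M-16.3",  # Workstation
}

CLASS_ORDER = ["M-16.0", "M-16.1", "M-16.2", "M-16.3"]

def route(requested: str, device: str) -> str:
    """Downgrade the requested model class to the device's ceiling."""
    ceiling = DEVICE_CEILING[device]
    if CLASS_ORDER.index(requested) > CLASS_ORDER.index(ceiling):
        return ceiling
    return requested
```

A request for M-16.3 on a phone (D-16.A) would thus be served by an M-16.0 model, or escalated per policy.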
## Privacy Tiers

| Tier | Description |
|---|---|
| P-16.0 | Local-Only |
| P-16.1 | Local-Preferred |
| P-16.2 | Remote-Allowed |
Policy MUST override routing decisions when required.
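A sketch of tier enforcement, assuming the only inputs are the task's tier and whether a suitable local model is available (return values are illustrative):

```python
def apply_policy(tier: str, local_available: bool) -> str:
    """Resolve a routing decision under an AIP-16 privacy tier.

    P-16.0 forbids remote execution outright; P-16.1 prefers local
    but allows remote fallback; P-16.2 permits remote routing.
    """
    if tier == "P-16.0":  # Local-Only
        return "local" if local_available else "deny"
    if tier == "P-16.1":  # Local-Preferred
        return "local" if local_available else "remote"
    return "remote"       # P-16.2 Remote-Allowed
```

Note that under P-16.0 the policy overrides routing entirely: if no local model fits, the task is denied rather than sent remote.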
## Normative Requirements

- Local inference MUST respect AIP-02 (Capability).
- Local inference MUST respect AIP-03 (Policy).
- Local inference MUST preserve audit events (AIP-11).
- Secrets MUST NOT enter model context (AIP-05 / AIP-13).
- Remote fallback MUST be explicit and auditable.
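The last requirement implies that every remote fallback leaves a record. A minimal sketch of such a record, in the spirit of AIP-11 audit events (the field names are illustrative assumptions, not part of this AIP):

```python
import json
import time

def fallback_event(task_id: str, reason: str) -> str:
    """Serialize an explicit, auditable remote-fallback event.

    Field names below are hypothetical; AIP-11 governs the real schema.
    """
    return json.dumps({
        "event": "remote_fallback",  # makes the fallback explicit
        "task": task_id,
        "reason": reason,            # why local execution was insufficient
        "ts": int(time.time()),
    }, sort_keys=True)
```

Emitting the event before the remote call ensures the fallback is auditable even if the call itself fails.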
## Offline Mode

Offline mode MUST:

- continue local inference
- deny remote-only tasks
- preserve queued audit logs
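The three offline rules can be sketched together. This is a toy model under the assumption that tasks are tagged remote-only up front; the class and method names are illustrative:

```python
class OfflineRuntime:
    """Toy offline-mode runtime: run local, deny remote-only, queue audits."""

    def __init__(self) -> None:
        self.audit_queue: list[tuple[str, str]] = []

    def submit(self, task: str, remote_only: bool) -> str:
        if remote_only:
            # Remote-only tasks are denied while offline, and the
            # denial itself is preserved as a queued audit event.
            self.audit_queue.append(("denied", task))
            return "denied"
        self.audit_queue.append(("ran_local", task))
        return "ran_local"

    def flush(self) -> list[tuple[str, str]]:
        """Drain queued audit events once connectivity returns."""
        drained, self.audit_queue = self.audit_queue, []
        return drained
```

On reconnect, `flush()` hands the queued events to the normal audit pipeline so no offline activity is lost.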
## Security Considerations

Local-first execution reduces:

- centralized data risk
- network interception risk

but increases:

- endpoint compromise risk
Implementations SHOULD support:

- encrypted model storage
- signed model bundles
- version pinning
- revocation
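As one concrete example of the last three items, a bundle can be checked against a pinned digest before loading. This sketch uses a plain SHA-256 pin for brevity; a real deployment would verify an asymmetric signature over the bundle and consult a revocation list:

```python
import hashlib

def verify_bundle(data: bytes, pinned_digest: str) -> bool:
    """Return True iff the model bundle matches the pinned SHA-256 digest.

    Pinning the digest gives version pinning for free: a new model
    version has a new digest and is rejected until the pin is updated.
    """
    return hashlib.sha256(data).hexdigest() == pinned_digest
```

A runtime would refuse to load any bundle for which this check fails, treating it as tampered or revoked.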
## Regulatory Note

Local inference architecture supports:

- data minimization
- privacy-by-design principles
This AIP does not mandate any specific model vendor.