# AIP-16 — Local Inference Profile
## Abstract

This AIP defines the normative standard for local-first AI execution within Gao AI OS.

It establishes:

- device classification
- model class abstraction
- privacy tiers
- offline guarantees
- routing constraints
- policy integrity requirements
This AIP is vendor-neutral.
## Design Principle

Gao AI OS is model-agnostic.

Local inference MUST NOT depend on:

- a specific vendor
- a specific architecture
- a specific training pipeline
Runtime MUST operate on model classes, not model brands.
## Model Class Abstraction

Implementations MUST map concrete models into the following classes.

| Model Class | Indicative Size | Intended Use |
|---|---|---|
| M-16.0 Nano | ~0.5B–2B | Phone-level low latency |
| M-16.1 Small | ~2B–4B | Writing, extraction |
| M-16.2 Medium | ~4B–9B | Planning, RAG |
| M-16.3 Heavy | 9B+ | Advanced reasoning |
Runtime MUST route by class, not vendor name.
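A minimal sketch of class-based routing, assuming the runtime knows only a model's parameter count (the thresholds mirror the indicative sizes above and are illustrative, not normative):

```python
def model_class(params_billion: float) -> str:
    """Map a concrete model's parameter count to an AIP-16 model class.

    Routing on the returned class keeps the runtime vendor-neutral:
    no vendor or brand name ever enters the decision.
    """
    if params_billion < 2:
        return "M-16.0"  # Nano: phone-level low latency
    if params_billion < 4:
        return "M-16.1"  # Small: writing, extraction
    if params_billion < 9:
        return "M-16.2"  # Medium: planning, RAG
    return "M-16.3"      # Heavy: advanced reasoning
```

Any concrete model catalog would be reduced to these four classes before the router sees it.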
## Device Classes

| Device Class | Description |
|---|---|
| D-16.A | Phone |
| D-16.B | Tablet |
| D-16.C | Laptop |
| D-16.D | Workstation |
Routing MUST consider device constraints.
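One way to honor device constraints is to cap the requested model class at a per-device ceiling. The ceilings below are hypothetical assumptions for illustration; AIP-16 does not mandate them:

```python
# Illustrative ceilings: the largest model class each device class can host.
DEVICE_CEILING = {
    "D-16.A": "M-16.0",  # Phone
    "D-16.B": "M-16.1",  # Tablet
    "D-16.C": "M-16.2",  # Laptop
    "D-16.D": "M-16.3",  # Workstation
}

CLASS_ORDER = ["M-16.0", "M-16.1", "M-16.2", "M-16.3"]

def route(requested: str, device: str) -> str:
    """Downgrade the requested model class to the device's ceiling."""
    ceiling = DEVICE_CEILING[device]
    if CLASS_ORDER.index(requested) > CLASS_ORDER.index(ceiling):
        return ceiling
    return requested
```

A request for M-16.3 on a phone (D-16.A) would thus be served by an M-16.0 model, or escalated per policy.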
## Privacy Tiers

| Tier | Description |
|---|---|
| P-16.0 | Local-Only |
| P-16.1 | Local-Preferred |
| P-16.2 | Remote-Allowed |
Policy MUST override routing decisions when required.
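A sketch of tier enforcement, assuming the only inputs are the task's tier and whether a suitable local model is available (return values are illustrative):

```python
def apply_policy(tier: str, local_available: bool) -> str:
    """Resolve a routing decision under an AIP-16 privacy tier.

    P-16.0 forbids remote execution outright; P-16.1 prefers local
    but allows remote fallback; P-16.2 permits remote routing.
    """
    if tier == "P-16.0":  # Local-Only
        return "local" if local_available else "deny"
    if tier == "P-16.1":  # Local-Preferred
        return "local" if local_available else "remote"
    return "remote"       # P-16.2 Remote-Allowed
```

Note that under P-16.0 the policy overrides routing entirely: if no local model fits, the task is denied rather than sent remote.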
## Normative Requirements

- Local inference MUST respect AIP-02 (Capability).
- Local inference MUST respect AIP-03 (Policy).
- Local inference MUST preserve audit events (AIP-11).
- Secrets MUST NOT enter model context (AIP-05 / AIP-13).
- Remote fallback MUST be explicit and auditable.
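The last requirement implies that every remote fallback leaves a record. A minimal sketch of such a record, in the spirit of AIP-11 audit events (the field names are illustrative assumptions, not part of this AIP):

```python
import json
import time

def fallback_event(task_id: str, reason: str) -> str:
    """Serialize an explicit, auditable remote-fallback event.

    Field names below are hypothetical; AIP-11 governs the real schema.
    """
    return json.dumps({
        "event": "remote_fallback",  # makes the fallback explicit
        "task": task_id,
        "reason": reason,            # why local execution was insufficient
        "ts": int(time.time()),
    }, sort_keys=True)
```

Emitting the event before the remote call ensures the fallback is auditable even if the call itself fails.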
## Offline Mode

Offline mode MUST:

- continue local inference
- deny remote-only tasks
- preserve queued audit logs
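The three offline rules can be sketched together. This is a toy model under the assumption that tasks are tagged remote-only up front; the class and method names are illustrative:

```python
class OfflineRuntime:
    """Toy offline-mode runtime: run local, deny remote-only, queue audits."""

    def __init__(self) -> None:
        self.audit_queue: list[tuple[str, str]] = []

    def submit(self, task: str, remote_only: bool) -> str:
        if remote_only:
            # Remote-only tasks are denied while offline, and the
            # denial itself is preserved as a queued audit event.
            self.audit_queue.append(("denied", task))
            return "denied"
        self.audit_queue.append(("ran_local", task))
        return "ran_local"

    def flush(self) -> list[tuple[str, str]]:
        """Drain queued audit events once connectivity returns."""
        drained, self.audit_queue = self.audit_queue, []
        return drained
```

On reconnect, `flush()` hands the queued events to the normal audit pipeline so no offline activity is lost.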
## Security Considerations

Local-first execution reduces:

- centralized data risk
- network interception risk

but increases:

- endpoint compromise risk
Implementations SHOULD support:

- encrypted model storage
- signed model bundles
- version pinning
- revocation
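As one concrete example of the last three items, a bundle can be checked against a pinned digest before loading. This sketch uses a plain SHA-256 pin for brevity; a real deployment would verify an asymmetric signature over the bundle and consult a revocation list:

```python
import hashlib

def verify_bundle(data: bytes, pinned_digest: str) -> bool:
    """Return True iff the model bundle matches the pinned SHA-256 digest.

    Pinning the digest gives version pinning for free: a new model
    version has a new digest and is rejected until the pin is updated.
    """
    return hashlib.sha256(data).hexdigest() == pinned_digest
```

A runtime would refuse to load any bundle for which this check fails, treating it as tampered or revoked.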
## Regulatory Note

Local inference architecture supports:

- data minimization
- privacy-by-design principles
This AIP does not mandate any specific model vendor.