Anima Core

Technical FAQ

Direct answers to technical questions about AN1 scope, validation, and production readiness.

Does AN1 actually run inference in production?

Yes. All demo requests execute against the live AN1 API, return real model outputs, and are logged in the partner portal with request IDs, latency, and cost metrics.

Is this a mock or simulated savings calculation?

No. Baseline cost and AN1 cost are calculated from real execution paths. Savings reflect the difference between full-model baseline execution and AN1-optimized execution for the same workload.
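
As a rough illustration of the comparison described above, the sketch below sums per-request costs for the same set of requests executed both ways and takes the difference. The record fields (`request_id`, `baseline_cost`, `an1_cost`) and the dollar figures are hypothetical; they are not the partner portal's actual schema or real measurements.

```python
from dataclasses import dataclass


@dataclass
class RequestCost:
    # Hypothetical record; field names are illustrative, not the portal schema.
    request_id: str
    baseline_cost: float  # cost of full-model baseline execution for this request
    an1_cost: float       # cost of AN1-optimized execution for the same request


def summarize_savings(records):
    """Return (baseline_total, an1_total, savings) over the same workload."""
    baseline_total = sum(r.baseline_cost for r in records)
    an1_total = sum(r.an1_cost for r in records)
    return baseline_total, an1_total, baseline_total - an1_total


# Made-up numbers, for illustration only.
records = [
    RequestCost("req-1", 0.50, 0.125),
    RequestCost("req-2", 0.25, 0.125),
]
baseline_total, an1_total, savings = summarize_savings(records)
print(baseline_total, an1_total, savings)
```

Pairing both costs on the same request record keeps the comparison tied to an identical workload, which is the condition the answer above relies on.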

What workloads does AN1 currently support?

AN1 currently supports classification-style workloads using compressed representations (e.g., SST-2 sentiment classification). Additional task packs are added incrementally during pilots.

Does AN1 replace the base model?

No. AN1 operates downstream of representation extraction, not as a wholesale model replacement. It reduces inference cost and latency for supported workloads while preserving task-level accuracy.

How is accuracy validated?

For supported tasks, AN1 produces logits, probabilities, and class decisions that are directly comparable to baseline outputs. Accuracy is evaluated at the task level (whether the predicted class is correct), not by matching generated output token by token.
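
A minimal sketch of what a task-level comparison could look like, assuming both the baseline and AN1 return one list of logits per example. The function names, the two-class setup, and all numbers are made up for illustration; this is not AN1's actual evaluation harness.

```python
import math


def softmax(logits):
    """Convert a list of logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Class decision: index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


def accuracy(all_logits, labels):
    """Task-level accuracy: fraction of examples whose predicted class matches the label."""
    correct = sum(predict(l) == y for l, y in zip(all_logits, labels))
    return correct / len(labels)


# Made-up two-class logits (e.g. negative/positive), for illustration only.
labels = [1, 0, 1, 0]
baseline_logits = [[-1.0, 2.0], [1.5, -0.5], [0.2, 0.9], [0.3, 0.1]]
an1_logits = [[-0.8, 1.7], [1.2, -0.3], [0.4, 0.6], [-0.1, 0.2]]

baseline_acc = accuracy(baseline_logits, labels)
an1_acc = accuracy(an1_logits, labels)

# Fraction of examples where AN1 and the baseline reach the same class decision.
agreement = sum(
    predict(a) == predict(b) for a, b in zip(an1_logits, baseline_logits)
) / len(labels)

print(softmax(an1_logits[0]), baseline_acc, an1_acc, agreement)
```

Comparing class decisions against labels, and against each other, measures what the task needs without requiring the two systems to produce identical logits.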

Is generative output supported?

Not in the current local backend. Generative or text-to-representation workflows require an HTTP backend with a text-to-z extractor and are introduced selectively in pilots.

Is AN1 production-ready?

AN1 is pilot-ready. It is deployed, authenticated, metered, logged, and actively producing savings in live environments. Full production rollout depends on workload fit and pilot validation.

How is pricing calculated?

Partners pay a percentage of realized savings, not a flat platform fee. Baseline spend, AN1 spend, and savings are visible in the partner portal.
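
The arithmetic behind a savings-share fee can be sketched as follows. The 25% share rate and the spend figures are hypothetical placeholders, and charging nothing when AN1 spend exceeds baseline spend is an assumption of this sketch, not a stated contract term.

```python
def partner_fee(baseline_spend, an1_spend, share_rate):
    """Fee as a share of realized savings.

    Assumption: when there are no savings, the fee is zero.
    """
    if not 0.0 <= share_rate <= 1.0:
        raise ValueError("share_rate must be between 0 and 1")
    savings = max(baseline_spend - an1_spend, 0.0)
    return share_rate * savings


# Hypothetical figures: 1000.0 baseline spend, 400.0 AN1 spend, 25% share.
fee = partner_fee(1000.0, 400.0, 0.25)
print(fee)
```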

Can this be tested safely?

Yes. Pilots run alongside existing systems, require no model retraining, and can be disabled at any time.

What is not claimed?

AN1 does not claim universal model compression, full generative replacement, or zero-loss accuracy across all tasks. Claims are scoped to validated workloads.