AI Infrastructure · Platform Engineering · London
Architecting resilient platform infrastructure from microVM to edge.
Deep systems expertise for teams running AI workloads in production — BentoML model serving, Firecracker microVMs, NATS JetStream, SeaweedFS storage, Azure Cognitive, and the Platform Materialisation layer beneath it all.
Platform Engineering
Capabilities.
Seven disciplines we take to production — each a deep specialism, each deployable in anger. We build platforms the way we build bridges: with engineering, not aspiration.
01 / 08
BentoML Model Serving
Production inference with autoscaling, A/B testing, and cost-performance tuning.
02 / 08
Firecracker MicroVM Isolation
Per-tenant sandboxing for untrusted workloads at VM speed.
03 / 08
NATS Messaging Fabric
JetStream persistence, KV store, and object store for event-driven systems.
04 / 08
SeaweedFS Storage
Block and object storage, S3-compatible, with a small operational footprint.
06 / 08
Platform Materialisation
Umbrella offering: Kubernetes control planes and Envoy Gateway — the foundation beneath everything above.
07 / 08
Concept Kernel · Ontologies
A Turtle/OWL ontology corpus for sovereign AI kernels — identity, provenance, governance, grounded in BFO and PROV-O.
core.ttl · proof.ttl · rbac.ttl · processes.ttl · self-improvement.ttl
08 / 08
TTS & Voice Cloning
Multi-model speech synthesis (Parler · CosyVoice 2 · F5 · Dia · StyleTTS 2) plus voice cloning, queued on Azure A100 serverless — $0 idle between batches, manifest-driven provenance for every render.
parler · cosyvoice2 · f5 · dia · styletts2
In production
Live examples.
Working deployments you can inspect right now — public proof of what our platforms do in anger.
01 / 05
Lab Console
Operator console for the NeuxScience Adaptive Dispatcher — an n-back task-difficulty prediction service. Live BentoML 1.4 deployment you can drive yourself.
02 / 05
Tileformer
A browser-based tile and keyframe animation sequencer for authoring visual experiments in cognitive psychology — 16×12 grid, layers, timeline scrubbing, frame-accurate stimulus composition.
03 / 05
Concept Kernel
A published Turtle/OWL ontology corpus for sovereign AI kernels — BFO-grounded identity, PROV-O provenance, SHACL constraints, ValueFlows economics. Currently v3.5-alpha6.
04 / 05
TTS Render Gallery
30 F5-TTS renders from the A100 serverless batch pipeline — voice-clone reference clips, style axes, cost-per-batch receipts. Local authoring tool; public capability lives on /tts/.
05 / 05
Evolution Engine
Firecracker-backed autonomous self-improving workflows — continuous micro-reasoning passes against LLMs composed from node-based 'concept kernels' (OODA · TEXT · LLM · PICK-1 · FIFO · MULTI). Minimal-token 'caveman' query shape for high-throughput, low-cost iteration.
Proof
Selected engagements.
Neux Ltd has been running platforms for enterprise and public-sector clients since 2014. A selection of concrete outcomes:
Method
How we work.
Every engagement — Assignment, Workshop, Audit, Architecture — follows the same three-part contract. No open-ended retainers, no scope that grows on its own, no invoices for ambiguity.
Scope
One capability, one outcome, one SLA. We agree the boundary before we start and write it into the engagement brief. Out-of-scope asks become a separate, priced follow-up — never absorbed silently.
Timeline
Engagements are time-boxed (1 day / 3 days / 1 week / 2 weeks depending on SKU). You get a calendar with start / end dates and a daily check-in cadence. Delays caused on our side reset the clock; delays on yours pause it.
Deliverables
Every engagement ships a working artefact (running code, a configured cluster, a measured benchmark) plus a written memo you can hand to a colleague. No engagement ends with "it's in Peter's head."
Shop
Nine engagements — each with its own SLA.
Opinionated, packaged offerings. No prices — every engagement ships against a specific, measurable service-level objective. Add to cart to request delivery.
01 / 09
BentoML Model Serving
SLA · p99 < 200ms per inference · autoscales 0–50 replicas · cost-perf tuning
02 / 09
Firecracker MicroVM Isolation
SLA · cold-start < 15s per microVM · < 1% blast radius across tenants
03 / 09
NATS Messaging Fabric
SLA · p99 < 5ms publish-to-deliver · JetStream 3-replica · KV + ObjectStore
04 / 09
SeaweedFS Storage
SLA · p99 < 20ms read · 99.99% object durability · erasure-coded
05 / 09
Azure Cognitive Integration
SLA · p99 < 60ms for Speech-to-Text · cached on SeaweedFS · cost-governed
06 / 09
Kubernetes Control Planes
SLA · etcd quorum recovery < 15s · air-gap installable · multi-region federation
07 / 09
Envoy Gateway (Advanced)
SLA · p99 per-hop overhead < 3ms · mTLS rotation hands-off · policy-driven routing
08 / 09
Platform Materialisation
SLA · umbrella engagement — scope + design + build across our stack
09 / 09
TTS & Voice Cloning Pipeline
SLA · $0 idle · 5-engine dispatch · A100 serverless batching · manifest-driven provenance
Hands-on training
Workshops.
8 hands-on workshops (seven one-day, one two-day), each mapped to a capability we deliver. Small cohort, real hardware, runbook to take away. Scheduling opens as each is finalised; all 8 are in draft today. See all on the Workshops page →
Hands-on: BentoML on RKE2 — 1-day workshop
Containerise a real model, autoscale it behind Envoy, measure p99 latency and cost-per-request. Ship with a runbook.
Scheduling soon →
Hands-on: Firecracker runners — 1-day workshop
Per-tenant microVM isolation from the 200-LOC primitive through the Ignite-managed CI runner pool. Prove the blast-radius story.
Scheduling soon →
Hands-on: NATS for event-driven systems — 1-day workshop
Document-ingest pipeline on JetStream with idempotent replay, KV state, and the failure-drill playbook end-to-end.
Scheduling soon →
Hands-on: SeaweedFS storage — 1-day workshop
Stand up a 4-node cluster, attach to K8s via the CSI driver, benchmark against Longhorn on the same hardware.
Scheduling soon →
Hands-on: Azure Cognitive in a bespoke platform — 1-day workshop
Wire STT into an Envoy ext_authz edge pipeline with a SeaweedFS cache. Add a Content Safety gate. Measure cost savings honestly.
Scheduling soon →
Hands-on: TTS pipeline on A100 — 1-day workshop
5-engine dispatch on Azure serverless A100. Voice cloning with consent. Manifest-driven provenance for every render.
Scheduling soon →
Hands-on: Air-gapped RKE2 — 1-day workshop
Bootstrap a 3-node RKE2 cluster on a sealed network from a single pre-staged tarball. Bundle pipeline + bootstrap script + CIS-benchmark evidence path.
Scheduling soon →
Hands-on: Envoy Gateway advanced — 2-day workshop
mTLS + Gateway API routing + Redis-backed ratelimit service + ext_authz gates + Grafana panels. Two days, production-shape stack.
Scheduling soon →
Podcast
Coming soon.
A long-form podcast on platform engineering, AI infrastructure, and operating systems in production — migrating across from styk.tv shortly.
Writing
Field notes.
14 long-form write-ups in draft — one per capability drill we run on engagements. Peter reviews and publishes each as it's ready; the list below previews scope. RSS + a sparse email list land with the first published piece; no tracking pixels, readable archive without giving up an address. See all on the Writing page →
Serving a Whisper speech-to-text model
Autoscaling, cost-curve, and handover runbook for a production Whisper deployment on a single-node K3s + L4 GPU.
Draft →
A/B testing two inference servers
Shadow → canary → cutover with Envoy `weighted_clusters`, NATS shadow bus, and auto-rollback abort conditions.
Draft →
Per-tenant microVM sandbox in 200 LOC
Jailer + rootfs + single-TAP isolation, hand-driven, with a blast-radius audit that mirrors the /firecracker/ threat model.
Draft →
Ignite + Firecracker for CI runner isolation
Ephemeral microVM runner pool for GitHub Actions, with cost break-even vs hosted + concurrent-job leak test.
Draft →
Event-driven microservices on NATS JetStream
Document-ingest + OCR + KV + object-store pipeline with idempotent replay drill and single-node-kill failure test.
Draft →
NATS as a service-mesh data plane alternative
5-service demo using NATS `micro` framework + per-service accounts + leaf-node locality. Honest trade-off vs Envoy.
Draft →
S3-compatible storage on a Raspberry Pi cluster
4-Pi topology with master-failover drill, apples-to-apples benchmark vs Longhorn, and erasure-coding recovery.
Draft →
Mount SeaweedFS as Kubernetes persistent volume
CSI driver + Postgres workload + `fio` matrix yielding a choice-tree runbook for SeaweedFS vs Longhorn.
Draft →
Wire Speech-to-Text into an edge pipeline
Envoy `ext_authz` → sha256-keyed SeaweedFS cache → Azure STT on miss. Measured 25-30% Azure cost cut on 30%-repeat workload.
Draft →
Content Safety as a middleware gate
Per-route opt-in Envoy gate → Azure Content Safety. 500-item corpus eval + fail-open vs fail-closed decision record.
Draft →
Air-gapped RKE2 bootstrap in 90 minutes
Pre-staged bundle + bootstrap script that survives a procurement-grade transfer to sealed hosts. From cold iron to `kubectl get nodes` under 90 minutes.
Draft →
Per-API-key rate limiting with a Redis-backed global service
Production-shape global rate limits across N Envoy replicas via the upstream `ratelimit` service + Redis. Includes fail-open vs fail-closed decision drill.
Draft →
Observability triad · Mimir + Loki + Tempo, fronted by Grafana
Platform artefact: the default open-source observability baseline every other piece links to.
Draft →
Runbook: rolling back a bad Divi 5 update without the Migrator
Staged upgrade + regression classification + reversible rollback scripts. Applies to our own site.
Draft →
About
The person you're hiring.
Peter Styk — platform architect, Neux Ltd principal since 2014 (LinkedIn). Deep practitioner on microVM isolation, service meshes, distributed storage, and inference serving. Engagements delivered for teams running AI workloads in production; every deliverable comes with runbooks you can run without him. The sister brand neux.ai covers the AI-consultancy work that doesn't belong on a platform-engineering site.
Contact
Book a call.
No sales funnel, no gated download, no tracking pixels. Two or three sentences on the problem is enough for Peter to prep. Replies within 2 working days.
Neux Ltd
Since 2014.
© 2014–2026 Neux Ltd
Registered in England & Wales.