Infrastructure

Scalable Media Infrastructure.
Powered by Ferment.

Proprietary GL-based processing for global platforms. Secure, frame-accurate, and optimized for high-performance pipelines.

Architecture X-Ray — GL Renderer / Neural Layer / API Gateway

Enterprise Grade

Built for Scale.

Deterministic Performance

Frame-accurate, low-latency media manipulation. Every frame is processed deterministically — identical input, identical output, run after run.

Architecture Scalability

Optimized for high-scale GPU clusters. Elastic scaling for enterprise-level demand.

Radical Cost Reduction

Up to 3x faster than standard FFmpeg-based cloud pipelines. Lower compute, higher throughput.

Security & Sovereignty

Deploy on private clusters with dedicated 24/7 technical support. Your data never leaves your infrastructure.

The Core Stack

Modular Working Units.

Each unit integrates independently or orchestrates as a full pipeline.

Unit 01

Neural Visual Perception

  • AnyObject Segmenter — Natural language driven object detection with pixel-accurate RLE-encoded segmentation.
  • Vocabulary Perceptor — Temporal object tracking across video streams with visual and semantic embedding generation.
  • Universal Segmenter — High-fidelity point-and-stroke tracking for complex visual isolation and masking.
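To make the mask format concrete: a minimal, dependency-free sketch of run-length encoding as it is commonly applied to binary segmentation masks (COCO-style counts of alternating runs, starting with zeros). This is an illustrative implementation, not the Segmenter's internal codec.

```python
def rle_encode(mask):
    """Run-length encode a flat binary mask: counts of alternating
    0-runs and 1-runs, beginning with the count of leading zeros."""
    counts, prev, run = [], 0, 0
    for pixel in mask:
        if pixel == prev:
            run += 1
        else:
            counts.append(run)
            prev, run = pixel, 1
    counts.append(run)
    return counts

def rle_decode(counts):
    """Invert rle_encode back into a flat binary mask."""
    mask, value = [], 0
    for run in counts:
        mask.extend([value] * run)
        value ^= 1  # alternate between 0-runs and 1-runs
    return mask

mask = [0, 0, 1, 1, 1, 0, 1, 0, 0]
encoded = rle_encode(mask)
print(encoded)  # [2, 3, 1, 1, 2]
assert rle_decode(encoded) == mask
```

RLE keeps masks pixel-accurate while compressing well, since real object masks are dominated by long runs of identical values.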

Unit 02

Kinetic & Temporal Analysis

  • Motion Analyzer — Extracting optical flow, motion vectors, and camera trajectory.
  • Temporal Consistency — Ensuring pixel-level persistence across variable frame rates.
  • Saliency Mapping — Calculating visual importance scores for intelligent auto-cropping and focus-tracking.
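The auto-cropping idea above can be sketched in a few lines: given a grid of per-region importance scores, slide a crop window and keep the position that captures the most saliency. The grid values and window sizes here are illustrative, not the Engine's actual scoring model.

```python
def best_crop(saliency, crop_w, crop_h):
    """Return the (x, y) of the crop window capturing the most saliency.
    `saliency` is a 2-D grid (list of rows) of importance scores."""
    rows, cols = len(saliency), len(saliency[0])
    best_xy, best_score = (0, 0), float("-inf")
    for y in range(rows - crop_h + 1):
        for x in range(cols - crop_w + 1):
            # Sum the scores inside the candidate window.
            score = sum(
                saliency[y + dy][x + dx]
                for dy in range(crop_h)
                for dx in range(crop_w)
            )
            if score > best_score:
                best_xy, best_score = (x, y), score
    return best_xy

# Toy 3x4 saliency map: the subject sits on the right side of the frame.
grid = [
    [0.1, 0.2, 0.8, 0.9],
    [0.1, 0.3, 0.9, 1.0],
    [0.0, 0.1, 0.4, 0.5],
]
print(best_crop(grid, 2, 2))  # (2, 0): the top-right 2x2 window
```

A production system would run this per frame and smooth the window's trajectory over time for stable focus-tracking.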

Unit 03

Acoustic Intelligence

  • Stem Separator — High-fidelity isolation of vocals, drums, bass, and instruments for deep audio re-composition.
  • Structural Detector — Self-Similarity Matrix (SSM) analysis for identifying chorus, verse, and bridge transitions.
  • Event Perceptor — Detection of specific acoustic triggers — from drops and claps to environmental sound events.
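A Self-Similarity Matrix is simple to illustrate: compare every audio frame's feature vector against every other frame's. Repeated sections such as a chorus appear as off-diagonal blocks of high similarity. This stdlib-only sketch uses toy feature vectors; real structural detection works on chroma or embedding features.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def self_similarity_matrix(features):
    """SSM[i][j] = similarity between frame i and frame j.
    Repeated musical sections show up as off-diagonal blocks."""
    return [[cosine(fi, fj) for fj in features] for fi in features]

# Toy chroma-like frames: frames 0-1 and 3-4 share a pattern (a repeated
# "chorus"); frame 2 is a contrasting "verse".
frames = [[1, 0, 1], [1, 0, 1], [0, 1, 0], [1, 0, 1], [1, 0, 1]]
ssm = self_similarity_matrix(frames)
print(round(ssm[0][3], 2))  # 1.0 — frame 0's pattern recurs at frame 3
print(round(ssm[0][2], 2))  # 0.0 — no overlap with the verse frame
```

Transition points between sections fall where these similarity blocks begin and end.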

Unit 04

Semantic & Multi-modal Encoding

  • Hybrid Semantic Encoder — Dense and sparse embedding generation for transcripts and text-based search.
  • Audio-Visual Embedder — Multi-modal vectorization for semantic search and reranking across heterogeneous media libraries.
  • Context Injection — Enhancing embedding quality through source-level metadata and temporal context.
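The dense-plus-sparse combination can be sketched as a blended retrieval score: cosine similarity over dense embeddings plus term overlap over sparse tokens. The vectors, tokens, and `alpha` weight below are illustrative assumptions, not the Hybrid Semantic Encoder's actual formula.

```python
import math

def dense_score(a, b):
    """Cosine similarity between two dense embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def sparse_score(query_terms, doc_terms):
    """Jaccard term overlap between sparse token sets."""
    q, d = set(query_terms), set(doc_terms)
    return len(q & d) / len(q | d) if q | d else 0.0

def hybrid_score(q_vec, d_vec, q_terms, d_terms, alpha=0.5):
    """Blend both signals; alpha weights the dense (semantic) side."""
    return alpha * dense_score(q_vec, d_vec) + (1 - alpha) * sparse_score(q_terms, d_terms)

score = hybrid_score(
    [0.6, 0.8], [0.6, 0.8],              # identical dense vectors -> 1.0
    ["drum", "solo"], ["drum", "loop"],  # 1 shared term of 3 -> ~0.33
    alpha=0.5,
)
print(round(score, 2))  # 0.67
```

The dense side captures semantic similarity ("percussion" matches "drums"); the sparse side rewards exact keyword hits, which matters for names and transcript search.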

Developer Experience

Built for Engineers.

Native SDKs for Python, Rust, and Ruby. Full API reference and documentation.

# Unit 03: Acoustic Event Detection
from ferment import AudioEngine, AcousticEventInput

result = AudioEngine.process(
    AcousticEventInput(
        source="track_01.mp3",
        target="beats"
    )
)

for event in result.events:
    print(f"Beat at {event.timestamp}ms — confidence: {event.score}")

Scale Your Pipeline.

Tell us about your media pipeline. We'll show you what the Engine can do.