Bouwen in het openbaar · Fase 1 in uitvoering

Roadmap

Realtime status van elk EULLM-component, de mijlpalen die we bereiken en een volledige geschiedenis van elk uitgebracht release.

GitHub Releases Bekijk broncode

Componentstatus

Platformoverzicht

Productierijp

EULLM Engine

v0.6.2

Rust-inferentieruntime. Multimodaal vision + audio, drop-in vervanging voor Ollama met OpenAI-compatibele API en ingebouwde chat-UI op localhost:11435.

Voortgang88%

259 tok/s

Doorvoer

Vision+Audio

Multimodaal

✓ getest

Windows

In ontwikkeling

EULLM Forge

Model-verticalisatiepipeline. Componenten gereed, end-to-end CLI-integratie in uitvoering.

Voortgang42%

30B→7B

Groottereductie

GGUF

Export

Beta

Pipeline

Preview

EULLM Hub

EU-gehost modelregister met AI Act-compliancekaarten. Operationeel als prototype.

Voortgang25%

Prototype

Modellen

3 gepland

Sectoren

Alleen EU

Hosting

Engine-mogelijkheden — v0.6.2

Rust-runtime · continue batching · multimodaal vision + audio · volledig lokaal op consumenten-GPU's

259 tok/s

Doorvoer

16 gelijktijdige verzoeken

Vision+Audio

Multimodaal

OCR, scène, transcriptie

~2-4×

Gekwantiseerde KV

context, Q4_0/Q5/Q8

--web

Webbrowsen

modelagnostisch, elke GGUF

Ontwikkelingsfasen

Wat we bouwen

01Huidig

Fase 01 — Fundament

Q1 2026

Kern-inferentie-engine bereikt productiekwaliteit. Forge-pipelinecomponenten gebouwd. Hub operationeel als prototype.

11/13 items85%

Engine: standalone binaries (Linux x64, Windows x64)
Multimodaal vision + audio (Gemma 4)
Continue batching — 259 tok/s
Gekwantiseerde KV-cache — Q4_0/Q5/Q8 (~2-4× context)
OpenAI-compatibele + Ollama drop-in API
GPU: CUDA (getest), ROCm, Vulkan, Metal
Ingebouwde auditlogging EU AI Act
Transparant webbrowsen (--web, modelagnostisch)
Interactieve REPL: /temp, /maxtokens, /system
Ingebouwde chat-UI — localhost:11435, ~29 KB in binair bestand
Forge: structureel snoeien + kennisdestillatie
Forge end-to-end pipeline CLI
Demonstratiemodel: legal-it-7b

02Gepland

Fase 02 — Ecosysteem

Q2 2026

Eerste productieklare Hub-modellen gaan live. Stabiele Forge CLI. Uitgebreide platformondersteuning.

1/8 items13%

Hub: model voor juridische sector (EU/Italiaans recht)
Hub: model voor medische triage-ondersteuning
Hub: model voor Finance & KYC-compliance
AI Act-compliancekaarten voor alle Hub-modellen
Forge: stabiele CLI + volledige documentatie
Windows x64-ondersteuning
Multi-GPU-inferentie
Kwantiseringswizard voor consumentenhardware

03Toekomst

Fase 03 — Enterprise

H2 2026

Enterprise-verharding: gedistribueerde inferentie, toegangsbeheer, visuele Forge Studio-interface.

0/7 items0%

Multi-node gedistribueerde inferentie
Kubernetes-operator
SSO / RBAC-toegangsbeheer
Forge Studio — visuele fijnafstemmings-UI
Modelversioning & rollback in Hub
Gecertificeerde EU-datacentrumpartnerschappen
SLA-ondersteuningsniveaus

Changelog

Releasegeschiedenis

v0.6.2Nieuwste9 Jun 2026

Multimodal in the Chat UI — drop in an image or audio clip, fully local
Vision + audio understanding stable (Gemma 4): OCR, scene description, transcription
BOS token handling fix for multimodal prompts

v0.6.07 Jun 2026

Multimodal vision launched — image OCR and scene description on consumer GPUs
Audio understanding (experimental, CLI) — transcription and in-content search
Runs fully local, zero telemetry

v0.5.206 Jun 2026

Math expression rendering in the Chat UI
Quantized KV cache — Q4_0/Q5/Q8 for ~2-4× context on the same GPU

v0.5.331 May 2026

Embedded chat UI on localhost:11435 — ~29 KB in binary, zero CDN or external dependencies
eullm -V now shows the active backend variant
Standalone Windows binaries: CPU and CUDA

v0.4.427 May 2026

Web tool calling — transparent URL fetching in conversation
Legal-IT dataset preparation module
GPU layer fitting improvements

v0.4.38 Apr 2026

Drop-in Ollama replacement with continuous batching
Quantized KV cache for larger context on 16 GB GPUs
Transparent web browsing without function-call overhead
EU AI Act audit logging built-in

v0.3.136 Apr 2026

Interactive REPL: /temp, /maxtokens, /system commands
Quantized KV cache quality/accuracy automatic recommendations

v0.3.105 Apr 2026

Quantized KV cache math accuracy improvements
1% accuracy loss isolated to matrix operations only

v0.3.53 Apr 2026

Default context window increased to 2 048 tokens
Math accuracy benchmarking suite added

v0.3.31 Apr 2026

Mixed KV cache type support

v0.3.230 Mar 2026

Bug fixes
Documentation updates

v0.2.9829 Mar 2026

Batch scheduler refinements
Build pipeline stabilization

Bekijk alle releases op GitHub →

Vorm de roadmap

Open een issue, stem op functies of draag code bij. EULLM wordt in het openbaar gebouwd en elke stem telt.

Open een issue Doe mee aan de discussie