Sviluppiamo in pubblico · Fase 1 in corso

Roadmap

Stato in tempo reale di ogni componente EULLM, le milestone che stiamo centrando e la cronologia completa di ogni release pubblicata.

Release su GitHub Vedi il codice

Stato componenti

Panoramica piattaforma

Pronto per la produzione

EULLM Engine

v0.6.2

Runtime di inferenza in Rust. Multimodale visione + audio, sostituto drop-in di Ollama con API compatibile OpenAI e chat UI integrata su localhost:11435.

Avanzamento88%

259 tok/s

Throughput

Visione+Audio

Multimodale

✓ testato

Windows

In sviluppo

EULLM Forge

Pipeline di verticalizzazione modelli. Componenti pronti, integrazione CLI end-to-end in corso.

Avanzamento42%

30B→7B

Riduzione

GGUF

Export

Beta

Pipeline

Anteprima

EULLM Hub

Registro modelli ospitato nell'UE con schede di conformità AI Act. Operativo come prototipo.

Avanzamento25%

Prototipo

Modelli

3 previsti

Settori

Solo UE

Hosting

Capacità Engine — v0.6.2

Runtime Rust · continuous batching · multimodale visione + audio · tutto in locale su GPU consumer

259 tok/s

Throughput

16 richieste concorrenti

Vision+Audio

Multimodale

OCR, scene, trascrizione

~2-4×

Quantized KV

context, Q4_0/Q5/Q8

--web

Web browsing

model-agnostic, ogni GGUF

Fasi di sviluppo

Cosa stiamo costruendo

01In corso

Fase 01 — Foundation

Q1 2026

Il motore di inferenza raggiunge la qualità di produzione. Componenti della pipeline Forge costruiti. Hub operativo come prototipo.

11/13 elementi85%

Engine: binari standalone (Linux x64, Windows x64)
Multimodale visione + audio (Gemma 4)
Continuous batching — 259 tok/s
Quantized KV cache — Q4_0/Q5/Q8 (~2-4× context)
API compatibile OpenAI + drop-in Ollama
GPU: CUDA (testato), ROCm, Vulkan, Metal
Audit logging EU AI Act integrato
Web browsing trasparente (--web, model-agnostic)
REPL interattivo: /temp, /maxtokens, /system
Chat UI integrata — localhost:11435, ~29 KB nel binario
Forge: structural pruning + knowledge distillation
Forge: CLI end-to-end pipeline
Modello demo: legal-it-7b

02Pianificata

Fase 02 — Ecosystem

Q2 2026

I primi modelli Hub pronti per la produzione vanno live. CLI Forge stabile. Supporto piattaforma ampliato.

1/8 elementi13%

Hub: modello settore legale (diritto UE/italiano)
Hub: modello supporto triage medico
Hub: modello conformità finanza e KYC
Schede conformità AI Act per tutti i modelli Hub
Forge: CLI stabile + documentazione completa
Supporto Windows x64
Inferenza multi-GPU
Wizard quantizzazione per hardware consumer

03Futura

Fase 03 — Enterprise

H2 2026

Hardening enterprise: inferenza distribuita, controllo accessi, Forge Studio UI visuale.

0/7 elementi0%

Inferenza distribuita multi-nodo
Kubernetes operator
SSO / RBAC access control
Forge Studio — UI visuale per il fine-tuning
Versionamento e rollback modelli in Hub
Partnership con data center EU certificati
Livelli di supporto SLA

Changelog

Cronologia release

v0.6.2Ultima9 Jun 2026

Multimodal in the Chat UI — drop in an image or audio clip, fully local
Vision + audio understanding stable (Gemma 4): OCR, scene description, transcription
BOS token handling fix for multimodal prompts

v0.6.07 Jun 2026

Multimodal vision launched — image OCR and scene description on consumer GPUs
Audio understanding (experimental, CLI) — transcription and in-content search
Runs fully local, zero telemetry

v0.5.206 Jun 2026

Math expression rendering in the Chat UI
Quantized KV cache — Q4_0/Q5/Q8 for ~2-4× context on the same GPU

v0.5.331 May 2026

Embedded chat UI on localhost:11435 — ~29 KB in binary, zero CDN or external dependencies
eullm -V now shows the active backend variant
Standalone Windows binaries: CPU and CUDA

v0.4.427 May 2026

Web tool calling — transparent URL fetching in conversation
Legal-IT dataset preparation module
GPU layer fitting improvements

v0.4.38 Apr 2026

Drop-in Ollama replacement with continuous batching
Quantized KV cache for larger context on 16 GB GPUs
Transparent web browsing without function-call overhead
EU AI Act audit logging built-in

v0.3.136 Apr 2026

Interactive REPL: /temp, /maxtokens, /system commands
Quantized KV cache quality/accuracy automatic recommendations

v0.3.105 Apr 2026

Quantized KV cache math accuracy improvements
1% accuracy loss isolated to matrix operations only

v0.3.53 Apr 2026

Default context window increased to 2 048 tokens
Math accuracy benchmarking suite added

v0.3.31 Apr 2026

Mixed KV cache type support

v0.3.230 Mar 2026

Bug fixes
Documentation updates

v0.2.9829 Mar 2026

Batch scheduler refinements
Build pipeline stabilization

Vedi tutte le release su GitHub →

Contribuisci alla roadmap

Apri un issue, vota le feature o contribuisci con del codice. EULLM si costruisce in pubblico e ogni voce conta.

Apri un issue Partecipa alla discussione