Efficient models.
Built for impact.

We help organizations build tailored AI systems that maximize
performance while minimizing cost and complexity.

We're actively hiring
Book a meeting
AWS Cloudflare Google Google Cloud Modal Supabase

Our mission

We build open-source small language models anyone can run, and scalable AI infrastructure enterprises can rely on.

01
kedy-small-3b — local inference

you: Explain black holes in simple terms.

kedy: A black hole is a region of space where gravity is so strong that nothing, not even light, can escape. They form when massive stars collapse under their own weight at the end of their lives. At the center lies a singularity, a point of infinite density. Around it sits the event horizon, the point of no return. Time itself slows near a black hole, and from outside, anything falling in appears to freeze at the edge forever.

112 tok/s · CPU only · offline

Open source

AI is for
everyone.

We build open-source, free models that run entirely on your own hardware — no cloud required, no subscription, no gatekeeping. Access to AI shouldn't depend on who you work for or how big your budget is.

Free forever · Runs locally · Apache 2.0
Download the models (soon)
02

Enterprise

Infrastructure
that scales.

From a single inference endpoint to a globally distributed cluster — Kedy gives enterprises the reliability, observability, and compliance they need to ship AI in production.

99.99% Uptime SLA
14 Regions
48 ms p50 latency
Talk to sales
Frankfurt
Amsterdam
Paris
New York
São Paulo
Virginia
Singapore
Tokyo
Sydney
Mumbai
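
To put the 99.99% uptime figure above in concrete terms, the sketch below converts an uptime percentage into the downtime it allows. This is plain arithmetic, not part of any Kedy API; the function name and period constants are illustrative, and the actual SLA terms (measurement window, exclusions) may differ.

```python
# Allowed downtime implied by an uptime percentage.
# Plain arithmetic for illustration; not a Kedy API.
MINUTES_PER_YEAR = 365 * 24 * 60      # 525,600 minutes in a non-leap year
MINUTES_PER_30_DAYS = 30 * 24 * 60    # 43,200 minutes


def allowed_downtime_minutes(uptime_percent: float, period_minutes: int) -> float:
    """Minutes of downtime permitted in a period at the given uptime percentage."""
    return (1 - uptime_percent / 100) * period_minutes


# 99.99% uptime leaves roughly 52.6 minutes per year, or about 4.3 minutes per 30 days.
print(round(allowed_downtime_minutes(99.99, MINUTES_PER_YEAR), 1))
print(round(allowed_downtime_minutes(99.99, MINUTES_PER_30_DAYS), 1))
```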

Capabilities

What Kedy can do for you.

From local experiments to global production. A single platform for every step of the AI journey.

Run anywhere

From a MacBook to an H100 cluster — the same model, everywhere.

Private by default

Your data never leaves your environment. GDPR compliance in progress.

Fine-tune in minutes

Ship custom models trained on your data with a single command.

Auto-scaling

From one request to one million — inference that scales transparently.

Real-time observability

Latency, cost, and quality metrics for every token — in one dashboard.

API first

Clean, versioned APIs and SDKs for Python, TypeScript, Go, and Rust.

Developer first

One SDK.
Every environment.

A single, coherent interface for your laptop, a self-hosted cluster, or Kedy Cloud. Swap providers with a single line of code — your application doesn't even notice.

  • Streaming + structured output
  • Function calling, tools, and agents
  • Works with OpenAI-compatible clients
Read the docs
main.py
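from kedy import Kedy

# Create a client.
client = Kedy()

# Request a streamed chat response.
stream = client.chat(
    model="kedy-small-3b",
    messages=[
        {"role": "user",
         "content": "Explain quantum tunneling."}
    ],
    stream=True,
)

# Print each chunk as it arrives; flush so partial output shows immediately.
for chunk in stream:
    print(chunk.text, end="", flush=True)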
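
The "works with OpenAI-compatible clients" and "swap providers with a single line" points can be illustrated without the SDK. The sketch below builds (but does not send) a request in the widely used OpenAI-style chat completions shape, using only the standard library. The two base URLs are hypothetical placeholders, not documented Kedy endpoints, and the helper name `build_chat_request` is an assumption made for illustration.

```python
import json
import urllib.request

# Hypothetical base URLs; replace with the endpoints you actually use.
LOCAL_BASE_URL = "http://localhost:8000/v1"
HOSTED_BASE_URL = "https://api.example.com/v1"


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completions request without sending it."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Switching providers means changing only the base URL argument.
req = build_chat_request(LOCAL_BASE_URL, "kedy-small-3b", "Explain quantum tunneling.")
print(req.get_method(), req.full_url)
```

Passing `HOSTED_BASE_URL` instead of `LOCAL_BASE_URL` leaves the rest of the call unchanged, which is the property the text describes.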

1K+

Model downloads

1.1M

Tokens per day

99.99%

Uptime SLA

48ms

Median latency

Europe-first infrastructure

Built in Europe.
Owned by you.

We believe AI should be sovereign. Our infrastructure is headquartered in the EU, our models are open-weight so your data never has to leave your country, and GDPR compliance is in progress across every product.

  • GDPR in progress: data processing agreements for every enterprise customer
  • EU data residency: Frankfurt, Amsterdam, and Paris regions
  • Open weights: run entirely on-premise, zero cloud dependency
  • AI Act ready: transparency reports and model cards for every release
Kedy infrastructure
Live inference

Global network

Built on a planet-scale mesh.

Our inference routes requests to the closest, healthiest replica — across 14 regions and counting. Automatic failover, zero cold-starts, and a single endpoint for every model in our catalog.
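
As a rough illustration of "closest, healthiest replica" routing with failover, the sketch below picks the lowest-latency replica among those currently marked healthy. It is a minimal sketch under stated assumptions, not Kedy's actual routing logic: the `Replica` type, its fields, and the latency values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Replica:
    region: str        # region name, e.g. "Frankfurt"
    latency_ms: float  # measured round-trip time from the caller (hypothetical input)
    healthy: bool      # result of the latest health check (hypothetical input)


def pick_replica(replicas: Sequence[Replica]) -> Optional[Replica]:
    """Return the lowest-latency healthy replica, or None if none is healthy."""
    healthy = [r for r in replicas if r.healthy]
    return min(healthy, key=lambda r: r.latency_ms, default=None)


replicas = [
    Replica("Frankfurt", 12.0, healthy=False),  # closest, but failing health checks
    Replica("Amsterdam", 18.0, healthy=True),
    Replica("Paris", 25.0, healthy=True),
]

chosen = pick_replica(replicas)
print(chosen.region if chosen else "no healthy replica")
```

Here the nearest replica is unhealthy, so the request fails over to the next closest healthy one.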

14 Regions
Scale
0 Cold starts

Start building
with Kedy today.

Deploy your first model in minutes. No credit card required.