A batteries-included local search engine for your data and code that you can talk to.
Point it at your files, notes, and code; ask questions in plain English. Every answer links back to the file and line. Point it at nothing and it's still a clean local-AI chat with the model catalog wired up; cloud models too if you bring an API key or use a frontier agent over MCP.
It runs on your computer. Your files stay on disk; lilbee uses a cloud model only when you pick one.
First run: a setup wizard pulls a chat model and an embedder, then drops you straight into chat.
One-minute sweep through every screen: setup wizard, chat with citations, model catalog, settings, task center, palette.
Streaming replies with clickable citations back to the file and line.
/add <path> copies a file or folder into your library and indexes it; ask questions while it syncs.
/crawl <url> fetches a page into your library, then answers against it with a page citation.
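A hypothetical session sketch: the two slash commands are the ones above, but the paths, prompts, and cited locations are invented for illustration.

```
> /add ~/notes/k8s/
  indexing 214 files… keep chatting while it syncs
> how do I drain a node before maintenance?
  Run kubectl drain with --ignore-daemonsets first. [notes/k8s/ops.md:87]
> /crawl https://example.com/docs/install
  fetched 1 page into the library
```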
Browse models on Hugging Face Hub, pull one live, switch a role without leaving the terminal.
50+ settings: search depth, reranking, sampling, parsers. Sane defaults; tune the moment you want to.
lilbee talking to lilbee. An agent indexes lilbee's own source through lilbee's MCP server, then answers questions about how lilbee works, with file:line citations.
Same setup, talking to a PDF: the agent asks lilbee to index cv-manual.pdf, then builds a fuse table with page citations.
More on the recordings: the full reel →
needs Python 3.11+ . Intel Mac: add --extra-index-url https://lilbee.sh/cpu/ . extras below, e.g. pip install --pre 'lilbee[crawler,litellm]'
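The same notes as a copy-paste block; the commands and the extra index URL are taken from the line above, nothing new is added:

```
pip install --pre lilbee
# Intel Mac:
pip install --pre lilbee --extra-index-url https://lilbee.sh/cpu/
# with extras:
pip install --pre 'lilbee[crawler,litellm]'
```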
uv fetches a Python for you if needed . extras: uv tool install --prerelease=allow 'lilbee[crawler]'
prebuilt bundle: its own Python interpreter and llama.cpp backend, nothing to compile . clears macOS quarantine for you . the [crawler] / [litellm] / [graph] extras are already included
package lilbee on the AUR . works with yay / pacaur / any helper . wraps the Linux x86_64 release binary, so the [crawler] / [litellm] / [graph] extras are already included
image on the GitHub Container Registry . data lives at /home/lilbee/data, REST API on port 8000 . wraps the release binary, so the [crawler] / [litellm] / [graph] extras are already included
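A minimal docker run sketch. The image path is an assumption (GHCR under the repo owner from the Nix flake below) and the tag is illustrative; the port and data path come from the line above:

```
docker run -it -p 8000:8000 \
  -v "$PWD/lilbee-data:/home/lilbee/data" \
  ghcr.io/tobocop2/lilbee:latest
```

The volume mount keeps the library on the host; the port mapping exposes the REST API.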
flake at github:tobocop2/lilbee . wraps the release binary, so the [crawler] / [litellm] / [graph] extras are already included . on Linux it bundles glibc and the Vulkan loader via an FHS env so it runs on bare NixOS
single binary, bundles its own Python runtime, no pip needed . the [crawler] / [litellm] / [graph] extras are already included . macOS arm64 and Windows builds on the releases page →
unsigned: the macOS arm64 and Windows builds aren't code-signed. If macOS blocks it, run xattr -d com.apple.quarantine ./lilbee-macos-arm64 (Homebrew does this for you).
only for the last bit of CUDA-native speed . the default wheel already uses your NVIDIA GPU through Vulkan . cu121 / cu124 indexes also available . works with uv tool install too
for hacking on it or contributing . needs git and uv
pip / uv install: add the name in brackets, e.g. 'lilbee[crawler,litellm]'. The binary, Homebrew, AUR, Nix, and Docker builds bundle all three already. lilbee works without them.

It's the model, the search through your files, and the chat, all in one program. Run it when you want, close it when you're done; nothing is left running in the background, no container to keep alive. Want something long-running? Use the command line and manage it yourself.
- a model server, always running
- model files fetched by hand
- a vector database to stand up
- code wiring them together
- a separate app for the interface
- often a container around it all
- the model runtime (llama.cpp) and the vector index (LanceDB) run inside lilbee, not as separate services to stand up
- use it as a full-screen terminal app, a command-line tool, a Model Context Protocol server, a web API, or a Python library
- a built-in model catalog: browse and pull straight from Hugging Face Hub, no hunting for model files yourself
- a scoped library per project, so each domain stays its own clean encyclopedia
- runs on a laptop or headless over a remote shell; move it between machines
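A hypothetical MCP client registration, assuming lilbee starts a stdio MCP server via a subcommand; the `mcp` subcommand name and the config shape (typical of MCP clients) are assumptions, not documented here:

```
{
  "mcpServers": {
    "lilbee": {
      "command": "lilbee",
      "args": ["mcp"]
    }
  }
}
```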
feed it your files
Point it at a folder: your man pages, a pile of PDFs, your notes, a codebase. Then talk to them. Every answer tells you the file and line. Each project gets its own library, so nothing bleeds across.
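The file-and-line citation idea can be sketched in a few lines of plain Python; this illustrates the concept only, not lilbee's actual code:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    path: str   # file the retrieved chunk came from
    line: int   # first line of the chunk in that file
    text: str   # the chunk itself

def cite(answer: str, sources: list[Chunk]) -> str:
    """Append [path:line] citations to an answer, the way a
    retrieval-augmented reply links back to its evidence."""
    refs = " ".join(f"[{c.path}:{c.line}]" for c in sources)
    return f"{answer} {refs}" if refs else answer

print(cite("Drain the node first.", [Chunk("notes/k8s/ops.md", 87, "kubectl drain ...")]))
# → Drain the node first. [notes/k8s/ops.md:87]
```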
pair it with your agent
Pair it with your favorite agent over MCP. It reads the real code and docs before it answers, cites the file and line, and says "I don't know" instead of guessing.
websites, offline
Crawl a docs site or a wiki, turn it into markdown, and keep it. Search and chat with it offline, even after it goes down.
scans & OCR
Old scans and photos go through OCR or a local vision model and come out as searchable markdown, layout intact.
Answers are only as good as the model you pick and the settings behind it. lilbee ships sane defaults, but exposes 50+ settings you can tune: how it searches, how it answers, and how your files get read.
lilbee stands on established open-source projects and wires them into one program.
- Kreuzberg: parses documents
- LanceDB: embedded vector index
- llama.cpp: runs models locally
- tree-sitter: chunks code
- crawl4ai + Playwright: crawl the web
- Textual: draws the terminal