Notion AI is cool, but I use Obsidian, and I prefer to keep my notes offline. How do I mimic the write-for-you experience as closely as possible?
In this post, I introduce my setup for enabling an LLM-powered autocomplete feature in Obsidian.
Today’s recipe involves only three ingredients:
- Alpaca, a large language model (LLM) that is free for non-commercial use. It’s Stanford’s instruction-tuned spin on Meta’s LLaMA, with output quality in the neighborhood of GPT-3.5. I’m using the 13-billion-parameter edition.
- LocalAI, which serves an LLM of your choice behind a web server that implements OpenAI’s API.
- Obsidian Text Generator Plugin for the autocomplete experience.
It takes only a few commands to spin up a LocalAI server:

```shell
git clone https://github.com/go-skynet/LocalAI
cd LocalAI
# Compile with Metal, because I'm on an M2 Max machine.
make BUILD_TYPE=metal build
./local-ai --models-path ./models/
```
This is just the API server; the models themselves are too big (~13 GiB) to be hosted on GitHub. HuggingFace is the right place to find large model files.
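As a sketch, fetching a model file from HuggingFace is a single `curl` call. The repo path below is a placeholder — substitute whichever HuggingFace repo actually hosts the file you picked:

```shell
# <user>/<repo> is a placeholder for the HuggingFace repo hosting the model;
# the resolve/main URL scheme is HuggingFace's standard direct-download path.
mkdir -p ./models
curl -L -o ./models/alpaca.13b.ggmlv3.q8_0.bin \
  "https://huggingface.co/<user>/<repo>/resolve/main/alpaca.13b.ggmlv3.q8_0.bin"
```

The `-L` flag matters: HuggingFace redirects download links to a CDN, and without it `curl` saves the redirect page instead of the model.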
Caveat: under the hood, LocalAI uses llama.cpp to serve LLaMA-style models. llama.cpp has gone through some breaking changes recently, rendering old model files unusable. As of this writing, if you search HuggingFace for LLaMA-style models, the most popular downloads are in the old format. Today, llama.cpp requires models prepared in the GGML v3 format, which you can recognize by the `ggmlv3` marker in the file name.
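If you’re not sure which vintage a file is, you can also peek at its header. As I understand the format, GGJT-style files begin with the 4-byte magic `0x67676a74` ("ggjt", stored little-endian, so the bytes on disk read "tjgg"), followed by a 4-byte version number — 3 for the current format. A sketch, with a hypothetical file path:

```shell
# First 4 bytes: magic ("tjgg" on disk for GGJT files);
# next 4 bytes: format version -- for GGML v3, expect 03 00 00 00.
xxd -l 8 ./models/alpaca.13b.ggmlv3.q8_0.bin
```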
Let’s get Alpaca. Download `alpaca.13b.ggmlv3.q8_0.bin` to `./models/` in the cloned repo, then rename it to `gpt-3.5-turbo` (no extension). The Text Generator Plugin has a hardcoded list of model names, so we have to pretend that we have one from the list.
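Concretely, the move-and-rename boils down to the following; the source path assumes the file landed in your downloads folder, so adjust to taste:

```shell
# Put the downloaded GGML file where LocalAI looks for models, under the
# model name the Text Generator Plugin expects (note: no file extension).
mkdir -p ./models
mv ~/Downloads/alpaca.13b.ggmlv3.q8_0.bin ./models/gpt-3.5-turbo
```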
In the settings of this plugin, point the endpoint to `http://localhost:8080`. Now you should be ready to go. The default trigger is a double space, so tap away.
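If the plugin seems unresponsive, it’s worth sanity-checking the server from the command line first. Since LocalAI implements OpenAI’s API, a standard chat-completions request should work (a sketch — the prompt is arbitrary):

```shell
# Hit the OpenAI-compatible chat endpoint directly; the "model" field must
# match the renamed file in ./models/.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Say hello."}]
      }'
```

If this hangs or errors out, the problem is on the LocalAI side, not in Obsidian.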
Don’t be too excited about the performance, though. On my MacBook Pro (`Mac14,5`), it took 20 seconds to give me this:
Apparently, Alpaca wasn’t a Portal fan.