Large Language Models

Notes on running and using LLMs locally on Linux.

Models

Public LLM models at ollama: https://ollama.com/search

Using an embedding model is important for RAG; these can be downloaded from the same site (see the example pull commands after the list below).

  • DeepSeek-R1 1.5b runs well on a 4-core machine.
  • Models like llama3.2 3b run slower, but are still faster than my reading speed.
  • Larger models don't run well on my 4-core Intel CPU, and probably require a GPU.
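
As a minimal sketch, pulling and running these models with the ollama CLI looks like this (the model tags are the ones listed on ollama.com and may change; nomic-embed-text is just one example of an embedding model from the site):

# Pull a small chat model and an embedding model for RAG
ollama pull deepseek-r1:1.5b
ollama pull nomic-embed-text

# Start an interactive chat session
ollama run llama3.2:3b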

Ollama and Thunderbird

Thunderbird can talk to Ollama through the ThunderAI extension: https://github.com/micz/ThunderAI

We need to set the OLLAMA_ORIGINS=moz-extension://* environment variable so that Ollama accepts cross-origin requests from browser extensions.

Add the environment variable through a "drop-in" for the ollama systemd unit:

sudo systemctl edit ollama.service --drop-in=thunderbird

The drop-in should look like this:

### Editing /etc/systemd/system/ollama.service.d/thunderbird.conf
### Anything between here and the comment below will become the contents of the drop-in file

[Service]
Environment="OLLAMA_ORIGINS=moz-extension://*"

### Edits below this comment will be discarded

### ...
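
After saving the drop-in, restart the service. As a quick check, a request with an extension-style Origin header should get an Access-Control-Allow-Origin header back if the variable took effect (the origin value below is only an illustration):

sudo systemctl restart ollama.service

# The response headers should include Access-Control-Allow-Origin
curl -i -H "Origin: moz-extension://example" http://localhost:11434/api/tags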

Open WebUI

A web UI for using large language models locally.

Main site: https://openwebui.com/

RAG can be set up to work with your own files; these collections are called "Knowledge bases" in Open WebUI: https://docs.openwebui.com/features/workspace/knowledge/
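
As a sketch of getting it running, the project documents both a Docker image and a pip install; the pip route looks like this (upstream recommends Python 3.11, and the server defaults to port 8080):

pip install open-webui

# Serves the UI at http://localhost:8080 and should find a local
# Ollama instance on its default port
open-webui serve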

AppFlowy

In Arch, try using the binary AUR package: https://aur.archlinux.org/packages/appflowy-bin

This is because it requires an additional binary executable in a path that is not accessible to Flatpak applications, and because the AppImage does not launch correctly, showing only a "No GL implementation is available" message.
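
A sketch of installing the AUR package, assuming the yay AUR helper is available (building manually with makepkg works the same way):

# With an AUR helper
yay -S appflowy-bin

# Or manually, with the usual AUR workflow
git clone https://aur.archlinux.org/appflowy-bin.git
cd appflowy-bin
makepkg -si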
