Personally, I’m just self-hosting a bunch of stuff:
litellm proxy, to connect to any llm provider
langfuse for observability
faster whisper server, the v3 turbo CTranslate2 version takes about 900 MB of VRAM and transcribes roughly 10 times faster than I speak
open-webui: as it’s connected to litellm and ollama, I avoid provider lock-in and keep all my messages on my own backend instead of having some at OpenAI, some at Anthropic, etc. It also supports artifacts and a bunch of other nice features, lets me craft my perfect prompts, and jailbreak when needed.
piper for TTS for now, but I plan to switch to a self-hosted fish audio.
for extra privacy I run a bunch of ollama models too. Mistral Nemo seems quite capable; otherwise a few llama3, qwen2, etc.
for embeddings, either bge-m3 or some self-hosted jina.ai models.
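For anyone wondering how the litellm piece glues things together: the proxy speaks the standard OpenAI chat-completions protocol, so any client can point at it with just a base URL change. A minimal stdlib-only sketch — the URL, model alias, and key below are placeholders for your own deployment, not my actual config:

```python
# Minimal sketch of talking to a litellm proxy. It exposes an
# OpenAI-compatible /v1/chat/completions endpoint, so the request
# body is the standard chat-completions shape.
import json
import urllib.request

LITELLM_URL = "http://localhost:4000/v1/chat/completions"  # placeholder

def build_payload(model: str, prompt: str) -> dict:
    # the same body works whether the alias routes to a local ollama
    # model or a remote provider behind the proxy
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def ask(model: str, prompt: str, api_key: str = "sk-local") -> str:
    req = urllib.request.Request(
        LITELLM_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Swapping the `model` alias is all it takes to move a conversation from a local model to a hosted one.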
I made a bunch of scripts to pipe my microphone / speaker / clipboard / LLMs together for productivity. For example I press shift four times, speak, press shift again, and voilà: what I said is turned into an Anki flashcard.
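To give a flavour of that pipeline, here's a rough sketch of the record-then-add-to-Anki part. Every command, URL, deck name, and field here is illustrative rather than my exact setup; it assumes `arecord` for capture and the AnkiConnect add-on listening on its default port:

```python
# Hypothetical sketch of the hotkey pipeline: record audio, transcribe
# it via a self-hosted whisper server, then push a flashcard to Anki.
# All paths, URLs, and names are placeholders.
import json
import subprocess
import urllib.request

# faster-whisper servers typically expose an OpenAI-style
# /v1/audio/transcriptions endpoint; the transcription POST is
# omitted here for brevity.
WHISPER_URL = "http://localhost:8000/v1/audio/transcriptions"
ANKI_CONNECT = "http://localhost:8765"  # AnkiConnect's default port

def record(path: str = "/tmp/note.wav", seconds: int = 10) -> str:
    # grab audio from the default microphone (assumes ALSA's arecord)
    subprocess.run(["arecord", "-f", "cd", "-d", str(seconds), path], check=True)
    return path

def add_note_payload(front: str, back: str, deck: str = "Default") -> dict:
    # AnkiConnect's addNote action (API version 6)
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck,
                "modelName": "Basic",
                "fields": {"Front": front, "Back": back},
                "tags": ["voice"],
            }
        },
    }

def add_note(front: str, back: str) -> None:
    req = urllib.request.Request(
        ANKI_CONNECT,
        data=json.dumps(add_note_payload(front, back)).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
```

The LLM step in between just reformats the raw transcript into a question/answer pair before `add_note` is called.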
As providers, I mostly rely on openrouter.ai, which lets me swap between providers without issue. These last few months I’ve been using Sonnet 3.5, but I switch as soon as there’s a new frontier model.
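Since openrouter keeps one API across providers, "switching to the new frontier model" really is just changing a model string. One way to make that explicit is a small preference list — the model ids below are illustrative, check openrouter's catalogue for the current ones:

```python
# Sketch: frontier-model swapping as a preference list. When a new
# model comes out, prepend its id and everything else stays the same.
# Model ids are illustrative placeholders.
PREFERRED = [
    "anthropic/claude-3.5-sonnet",  # current favourite
    "openai/gpt-4o",                # fallback
]

def pick_model(available: set) -> str:
    # walk the preference list, take the first model that's up
    for model in PREFERRED:
        if model in available:
            return model
    raise RuntimeError("no preferred model available")
```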
For interacting with codebases I use aider.
So in the end, all my costs come from API calls and none from subscriptions.