On LLMs, Privacy and Inadequate Silicon

I’ll admit it feels trite writing a post here — especially in a world where people rely on LLMs so heavily that parsing whether something is an original thought seems almost pointless. That said, I’ve grown quite attached to ChatGPT, Claude, and Manus AI over the last few months. There’s real appeal in using them as thinking or conversation partners rather than merely answer engines. They’ve been enormously useful for pressure-testing models and interrogating assumptions: from crafting mental frameworks for everyday problems to building out complex financial models (‘how much more will I need if I decide to swap out the 4% rule for a 3% rule?’).
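To make that kind of question concrete: swapping the 4% rule for a 3% rule means dividing annual spending by 0.03 instead of 0.04, so the target portfolio grows by roughly a third. A back-of-the-envelope sketch in Python, with a made-up spending figure:

    # Required portfolio under a fixed withdrawal rate.
    # The spending number is purely illustrative.
    annual_spending = 60_000

    need_at_4pct = annual_spending / 0.04   # 25x annual spending
    need_at_3pct = annual_spending / 0.03   # ~33.3x annual spending

    print(f"4% rule: {need_at_4pct:,.0f}")
    print(f"3% rule: {need_at_3pct:,.0f}")
    print(f"Extra needed: {need_at_3pct / need_at_4pct - 1:.0%}")  # ~33% more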

The more I use these tools, the guiltier I feel about feeding so much data into them. Naturally, I’ve spent time experimenting with local models like Mistral 7B and LLaMA 3 8B. Both are impressive and get the job done, but they’re not as smart or fast as ChatGPT — or as strong at coding as Claude (yes, guilty of vibe-coding).

Using Ollama and Open WebUI abstracts away much of the complexity, but a RAG (retrieval-augmented generation) pipeline has been necessary to give local setups any real awareness of my own documents. The real trick now is figuring out what stays local and what runs through ChatGPT or Claude.
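The local loop doesn't have to be elaborate, either. Here's a minimal sketch of the kind of retrieval step I mean, using the ollama Python client and numpy; the model names (nomic-embed-text, llama3) and the documents are placeholders for whatever you've actually pulled and indexed:

    # Minimal local RAG sketch: embed a few notes, retrieve the closest ones,
    # and stuff them into the prompt. Assumes an Ollama server is running and
    # that the two models below have been pulled with `ollama pull`.
    import numpy as np
    import ollama

    docs = [
        "Note from 2023: switched the retirement model to a 3% withdrawal rate.",
        "Hardware note: the M1 Air starts swapping once the context grows.",
        "Recipe archive: the good sourdough ratio is 70% hydration.",
    ]

    def embed(text: str) -> np.ndarray:
        # One embedding call per chunk is fine for a handful of notes.
        resp = ollama.embeddings(model="nomic-embed-text", prompt=text)
        return np.array(resp["embedding"])

    doc_vectors = np.stack([embed(d) for d in docs])

    def ask(question: str, top_k: int = 2) -> str:
        q = embed(question)
        # Cosine similarity against every stored chunk; keep the best top_k.
        sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
        context = "\n".join(docs[i] for i in np.argsort(sims)[::-1][:top_k])
        reply = ollama.chat(
            model="llama3",
            messages=[
                {"role": "system", "content": f"Answer using only this context:\n{context}"},
                {"role": "user", "content": question},
            ],
        )
        return reply["message"]["content"]

    print(ask("What withdrawal rate am I modelling now?"))

Open WebUI's document-upload feature does something like this behind the scenes; doing it by hand is what taught me where the tokens actually go.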

Working locally isn’t just about privacy — it changes your relationship with the tool. Some lessons learned from running local models:

Hardware matters: an M1 MacBook Air doesn't quite cut it when you need real speed or responsiveness.

Context windows are a bigger limitation than parameter counts (there's a short sketch on this after the list).

Local control forces discipline: understanding memory usage, disk space, prompt structure, and retrieval mechanics.
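On the context-window point: Ollama runs models with a fairly short context by default, and asking for more is an explicit (and memory-hungry) choice. A small sketch, assuming llama3 is pulled locally and that an 8K window fits in RAM:

    # Raising the context window via Ollama's num_ctx option.
    # Model name and window size are assumptions; memory use grows with num_ctx.
    import ollama

    long_notes = "\n".join(f"Note {i}: ..." for i in range(2000))  # easily past a default window

    reply = ollama.chat(
        model="llama3",
        messages=[{"role": "user", "content": f"Summarise these notes:\n{long_notes}"}],
        options={"num_ctx": 8192},  # request a larger window than the default
    )
    print(reply["message"]["content"])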

All of this is to say that the last couple of years have been a lot of fun. This whole experiment genuinely feels like it has heightened my sense of wonder and accelerated my pace of learning.

PS - I was overusing em dashes way before LLMs.