Open weights, multilingual, 32k context.
I have ~50 million sentences from English Project Gutenberg novels embedded with this.
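For scale, the embedding loop itself is trivial; roughly this, assuming a sentence-transformers-compatible checkpoint (the model name below is a stand-in, not necessarily the one being discussed, and for ~50M sentences you'd stream batches from disk rather than hold everything in memory):

    from sentence_transformers import SentenceTransformer

    # Stand-in checkpoint; swap in the actual model from the thread.
    model = SentenceTransformer("intfloat/multilingual-e5-large")

    sentences = [
        "It was the best of times, it was the worst of times.",
        "Call me Ishmael.",
    ]
    # Normalized vectors make cosine similarity a plain dot product later.
    embeddings = model.encode(sentences, batch_size=256, normalize_embeddings=True)
    print(embeddings.shape)  # (num_sentences, embedding_dim)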
Okay, I get it. Lisp is great.
Where should I start? It wasn't like I was planning on doing anything else at work next week...
This is the version I read:
https://www.abebooks.com/9780023397639/Little-LISPer-Third-E...
However, for complex projects, IMO one must read what the LLM wrote … every actual word.
When it ‘got away’ from me, in each case I had left something in the LLM-written markdown that I should have removed.
99% “I can ask for that later” and 1% “that’s a good idea I hadn’t considered” might be the right ratio when reading an LLM-generated plan/spec/work unit.
Breaking work into single-context passes … 50-60k tokens in Sonnet 4.5 has typically given me fantastic results.
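A rough pre-flight check I use to keep a work unit inside one pass (a sketch: tiktoken's cl100k_base is an OpenAI tokenizer, so it only approximates Sonnet's count, and "workunit.md" is a hypothetical file name):

    import tiktoken

    # Approximate token count of a plan/spec/work unit before handing it over.
    # cl100k_base is only a proxy for Sonnet 4.5's tokenizer, but it's close
    # enough to tell a 20k-token unit from an 80k-token one.
    enc = tiktoken.get_encoding("cl100k_base")

    def fits_single_pass(text: str, budget: int = 55_000) -> bool:
        return len(enc.encode(text)) <= budget

    with open("workunit.md") as f:  # hypothetical file
        print(fits_single_pass(f.read()))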
My side project uses Lean 4, and a carelessly left-in ‘validate’ rather than ‘verify’ led down a hilariously complicated path equivalent to matching an output against a known string.
I recovered, but it wasn’t obvious to me that this was happening. I, however, would not be able to write Lean proofs myself, so diagnosing and fixing the problem is a small price to pay for being able to mechanically verify that part of my software is correct.
And it is a developer feature hidden from end users. E.g., in your Ollama example, does the developer ask end users to install Ollama? Does the dev redistribute Ollama and keep it updated?
The ONNX format is pretty much a boring de facto standard for ML model exchange. It is under the Linux Foundation.
The ONNX Runtime is a Microsoft thing, but it is an MIT-licensed runtime for cross-language use and cross-OS/HW-platform deployment of ML models in the ONNX format.
That bit needs to support everything because Microsoft itself ships software on everything (Mac/Linux/iOS/Android/Windows).
ORT — https://onnxruntime.ai
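To make the “exchange format” bit concrete, a minimal sketch assuming PyTorch and onnxruntime are installed (the toy model and the "toy.onnx" file name are just for illustration):

    import torch
    import onnxruntime as ort

    # Export a toy PyTorch model to the ONNX format, then run it with the
    # MIT-licensed ONNX Runtime. Model and file name are illustrative only.
    model = torch.nn.Linear(4, 2)
    dummy = torch.randn(1, 4)
    torch.onnx.export(model, dummy, "toy.onnx", input_names=["x"], output_names=["y"])

    sess = ort.InferenceSession("toy.onnx", providers=["CPUExecutionProvider"])
    (out,) = sess.run(["y"], {"x": dummy.numpy()})
    print(out.shape)  # (1, 2)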
Here is the Windows ML part of this: https://learn.microsoft.com/en-us/windows/ai/new-windows-ml/...
The primary value claims for Windows ML (for a developer using it) are that it eliminates the need to:
- Bundle execution providers for specific hardware vendors
- Create separate app builds for different execution providers
- Handle execution provider updates manually
Since ‘EP’ is ultra-super-techno-jargon:
Here is what GPT-5 provides:
Intensional (what an EP is)
In ONNX Runtime, an Execution Provider (EP) is a pluggable backend that advertises which ops/kernels it can run and supplies the optimized implementations, memory allocators, and (optionally) graph rewrites for a specific target (CPU, CUDA/TensorRT, Core ML, OpenVINO, etc.). ONNX Runtime then partitions your model graph and assigns each partition to the highest-priority EP that claims it; anything unsupported falls back (by default) to the CPU EP.
Extensional (how you use them)
- You pick/priority-order EPs per session; ORT maps graph pieces accordingly and falls back as needed.
- Each EP has its own options (e.g., TensorRT workspace size, OpenVINO device string, QNN context cache).
- Common EPs: CPU, CUDA, TensorRT (NVIDIA), DirectML (Windows), Core ML (Apple), NNAPI (Android), OpenVINO (Intel), ROCm (AMD), QNN (Qualcomm).
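In Python it looks roughly like this (a sketch: "model.onnx" and the input name "input" are placeholders, and the CUDA EP only kicks in if that ORT build includes it):

    import numpy as np
    import onnxruntime as ort

    # Priority-ordered EP list: try CUDA first, fall back to CPU for any
    # graph pieces the CUDA EP does not claim. File/input names are placeholders.
    providers = [
        ("CUDAExecutionProvider", {"device_id": 0}),
        "CPUExecutionProvider",
    ]
    sess = ort.InferenceSession("model.onnx", providers=providers)
    print(sess.get_providers())  # which EPs actually got registered

    x = np.random.rand(1, 3, 224, 224).astype(np.float32)
    outputs = sess.run(None, {"input": x})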
Melinda French Gates, back when she was Melinda French, had a part in Clippy.
“Melinda French (then the fiancée of Bill Gates) was the project manager of Microsoft Bob”
Microsoft Bob is where Clippy was born.
Reference: https://www.artsy.net/article/artsy-editorial-life-death-mic...