Looks like a big pivot on target audience from developers to regular users, at least on the homepage https://ollama.com/ as a product. Before, it was all about the CLI versions of Ollama for devs, now it's not even mentioned. At the bottom of the blog post it says:
> For pure CLI versions of Ollama, standalone downloads are available on Ollama’s GitHub releases page.
Nothing against that, just an observation.
Previously I tested several local LLM apps, and the 2 best ones to me were LM Studio [1] and Msty [2]. Will check this one out for sure.
One missing feature that the ChatGPT desktop app has and I think is a good idea for these local LLM apps is a shortcut to open a new chat anytime (Alt + Space), with a reduced UI. It is great for quick questions.
I just updated and a bit annoying by default gemma3:4b was selected that I don't have on my local. I guess would be nicer to default to one of the models that are present.
It was nice it started downloading it but also there was no indication I don't have that model before hand until I opened drop-down to see download buttons.
do you know why ollama hasn't updated its models in over a month while many fantastic models have been released in that time, most recently GLM 4.5? It's forcing me to use LM Studio which I for whatever reason absolutely do not prefer.
thank you guys for all your work on it, regardless
Thanks for including it. Ollama is very good at what it does. Including the feature is showing mindful growth in helping ollama be that skateboard, scooter, car, etc that the developer needs for LLM at that time. Making it appeal to casual/hobbyist is the right approach.
PS totally running windows here and using kesor/ollama-proxy if I need to make it externally available.
Are there any plans to improve observability toolset for developers? There is myriad of various AI chat apps, and there is no clear reason why another one from Ollama would be better. But Ollama is uniquely positioned to provide the best observability experience to its users because it owns the whole server stack, any other observability tool (eg Langfuse) may only treat it as a yet another API black box.
Does the new app make it easier for users to expose the Ollama daemon on the network (and mdns discovery )? It’s still trickier than needed for Home Assistant users to get started with Ollama (which tends to run on a different machine).
This caught me out yesterday. I was trying to move models onto external disk, and it seems to require re-installation? but there was no sign of the simple CLI option that was previously presented and I gave up.
As a developer feature request, it would be great if ollama could support more than one location at once, so that it is possible to keep a couple models 'live' but have the option to plug in an external disk with extra models being picked up auto-magically based on the ollama_models path please. Or maybe the server could present a simple html interface next to the API endpoint?
And just to say thanks for making these models easily accessible. I am agAInst AI generally, but it is nice to be able to have a play with these models locally. I havent found one that covers Zig, but appreciate the steady stream of new models to try. Thanks.
I think having a bash script as the linux installation is more of a stop-gap measure than truly supporting Linux. And ollama is FOSS compared to LM Studio and Msty (as someone who switched from ollama to LM Studio; I'm very happy to see the frontend development of ollama and an easier method of increasing the context length of a model).
this is actually positive even for devs. The more users have ollama installed then you can release some desktop ai app for them and don't have to bundle additional models in your own app. Easier to provide to such user free or cheaper subscription because you don't have additional costs. Latest Qwen30B models area really powerful.
Would be even better if there was a installation template that checks if Ollama is installed and if not download it as sub installation first checking user computer specs if enough RAM and fast enough CPU/GPU. Also API to prompt user (ask for permission) to install specific model if haven't been installed.
> Would be even better if there was a installation template that checks if Ollama is installed and if not download it as sub installation first..... Also API to prompt user (ask for permission) to install specific model if haven't been installed.
That's actually what we've done for our own App [1]. It checks if Ollama and other dependencies are installed. No model is bundled with it. We prompt user to install a model (you pick a model, click a button and we download the model; similar if you wish to remove a model). The aim is to make it quite simple for non-technical folks to use.
> One missing feature that the ChatGPT desktop app has and I think is a good idea for these local LLM apps is a shortcut to open a new chat anytime (Alt + Space), with a reduced UI. It is great for quick questions.
I’d heard of Msty and briefly tried it before. I checked the website again and it looks quite feature rich. I hadn’t known about LM Studio, and I see that it allows commercial use for free (which Matt doesn’t).
How would you compare and contrast between the two? My main use would be to use it as a tool with a chat interface rather than developing applications that talk to models.
I use Msty all the time and I love it. It just works and it's got all features I want now, including generating alternate responses, swapping models mid-chat, editing both sent messages and responses, ...
I also tried LM Studio a few months back. The interface felt overly complex and I got weird error messages which made it look like I'd have to manually fix errors in the underlying python environment. Would have been fine if it was for work, but I just wanted to play around with LLMs in my spare time so I couldn't be bothered.
That feature is available in HugstonOne with a new tab, among other features :)
Edit: Is incredible how unethical are all the other developers with their crappie spam unrelated. Ollama is a great app and pioneer of AI, cudos and my best thanks.
I am somewhat surprised that this app doesn't seem to offer any way to connect to a remote Ollama instance. The most powerful computer I own isn't necessarily the one I'm running the GUI on.
This. This. A thousand times this. I hate Windows / MacOS but love their desktops. I love Linux / BSD but hate their desktops. So my most expensive most powerful workstation is always a headless Linux machine that I ssh into from a Windows or MacOS toy computer. Unfortunately most developers do not understand this. Every time I run a command in the terminal and it tries to open a browser tab without printing the URL, it makes me want to scream and shout and retire from tech forever to be a plumber.
You can replace the xdg-open command (or whichever command is used on your linux system) with your own. Just program it to fire over the url to a waiting socket on your windows box, and have it automatically open there. The details are pretty easy to work out, and the result will be seamless.
You can work around this by using SSH port forwarding (ssh -L 11434:localhost:11434 user@remote) to connect to a remote Ollama instance, though native support would definitely be better.
But it seems like the GUI already connects over the network, no? In that case, why do you need to do user research for adding what is basically a command line option, at its simplest? It would probably take less time to add that than to write the comment.
It's definitely coming, there is no way they would leave such an important feature on the table. My guess is they are waiting so they can announce connections to their own servers.
For all of Electron's promise in being cross-platform, "I'll just press this button and ship this Electron app on Linux and everything will be fine" is not the current state of things. A lot of it is papercuts like glibc version aggravation, but GPU support is persistently problematic.
The Element app on Linux is currently broken (if you want to use encryption, so basically for everyone) due to an issue with Electron. Luckily it still works in a regular browser. I'm really baffled by how that can happen.
I believe power users or developers can already use this from CLI in Linux. This new app for Windows and MacOS shows this is intended for regular users.
Heads up, there’s a fair bit of pushback (justified or not) on r/LocalLLaMA about Ollama’s tactics:
Vendor lock-in: AFAIK it now uses a proprietary llama.cpp fork and builts its own registry on ollama.com in a kind of docker way (I heard docker ppl are actually behind ollama) and it's a bit difficult to reuse model binaries with other inference engines due to their use of hashed filenames on disk etc.
Closed-source tweaks: Many llama.cpp improvements haven’t been upstreamed or credited, raising GPL concerns. They since switched to their own inference backend.
Mixed performance: Same models often run slower or give worse outputs than plain llama.cpp. Tradeoff for convenience - I know.
Opaque model naming: Rebrands or filters community models without transparency, biggest fail was calling the smaller Deepseek-R1 distills just "Deepseek-R1" adding to a massive confusion on social media and from "AI Content Creators", that you can run "THE" DeepSeek-R1 on any potato.
Difficult to change Context Window default: Using Ollama as a backend, it is difficult to change default context window size on the fly, leading to hallucinations and endless circles on output, especially for Agents / Thinking models.
---
If you want better, (in some cases more open) alternatives:
llama.cpp: Battle-tested C++ engine with minimal deps and faster with many optimizations
ik_llama.cpp: High-perf fork, even faster than default llama.cpp
llama-swap: YAML-driven model swapping for your endpoint.
LM Studio: GUI for any GGUF model—no proprietary formats with all llama.cpp optimizations available in a GUI
Open WebUI: Front-end that plugs into llama.cpp, ollama, MPT, etc.
“I heard docker people are behind Ollama” um yes it’s founded by ex docker people and has raised multiple rounds of VC funding. The writing is on the wall - this is not some virtuous community project, it’s a profit driven startup and at the end of the day that is what they are optimizing for.
“Justified or not” — is certainly a useful caveat when giving the same credit to a few people who complain loudly with mostly unauthentic complaints.
> Vendor lock-in
That is, probably the most ridiculous of the statements. Ollama is open source, llama.cpp is open source, llamafiles are zip files that contain quantized versions of models openly available to be run with numerous other providers. Their llama.cpp changes are primarily for performance and compatibility. Yes, they run a registry on ollama.com for pre-packed, pre-quantized versions of models that are, again, openly available.
> Closed-source tweaks
Oh so many things wrong in a short sentence. Llama.cpp is MIT licensed, not GPL license. A proprietary fork is perfectly legitimate use. Also.. “proprietary“? The source code is literally available, including the patches, on GitHub in ollama/ollama project, in the “llama” folder with a patch file as recent as yesterday?
> Mixed Performance
Yes, almost anything suffers degraded performance when the goal is usability instead of performance. It is why people use C# instead of Assembly or punch cards. Performance isn’t the only metric, which makes this a useless point.
> Opaque model name
Sure, their official models have some ambiguities sometimes. I don’t know know that is the “problem” that people make it out to be when ollama is designed for average people to run models, and so a decision like “ollama run qwen3” not being the absolutely maximum best option possible rather than the option most people can run makes sense. Do really think it is advantageous or user friendly, when Tommy wants to try out “Deepseek-r1” on his potato laptop that a 671b parameter model too large to fit on almost anything consumer computer is the right choice and that it is instead meant as a “deception”? That seems…disingenuous. Not to mention, they are clearly listed as such on ollama.com, where in black and white it says the deep seek-r1 by default refers with the qwen model, and that the full model is available as deep seek-r1:671b
> Context Window
Probably the only fair and legitimate criticism of your entire comment.
I’m not an ollama defender or champion, couldn’t care about the company, and I barely use ollama (mostly just to run qwen3-8b for embedding). It really is just that most of these complaints you’re sharing from others seem to have TikTok-level fact checking.
I gave the Ollama UI a try on Windows after using the CLI service for a while.
- I like the simplicity. This would be perfect for setting up a non-technical friend or family member with a local LLM with just a couple clicks
- Multimodal and Markdown support works as expected
- The model dropdown shows both your local models and other popular models available in the registry
I could see using this over Open WebUI for basic use cases where one doesn't need to dial in the prompt or advanced parameters. Maybe those will be exposed later. But for now - I feel the simplicity is a strength.
Update 2: I've been using the new Ollama desktop UI on Windows for a couple days now (released 4 days ago).
- I still appreciate the simplicity to the point where I use it more the Open WebUI - no logins, no settings, just chat
- I wish the model select in the chat box was either moved or was more subtle, currently it visually draws interest to something that doesn't change much
- Chat summaries sometimes overflow in the chat history area
- Small nit but the window uses the default app icon on Windows rather than the Ollama icon
Small update: thinking models also work well. I like that it shows the thinking stream in a fainter style while it generates, then hides it to show the final output when it's ready. The thinking output is still available with a click.
Another commenter mentioned not being able to point the new UI to a remote Ollama instance - I agree, that would be super handy for running the UI on a slow machine but inferring on something more powerful.
I've been on something of a quest to find a really good chat interface for LLMs.
Most import feature for me is that I want to be able to chat with local models, remote models on my other machines, and cloud models (OpenAI API compatible). Anything that makes it easier to switch between models or query them simultaneously is important.
Here's what I've learned so far:
* Msty - my current favorite. Can do true simultaneous requests to multiple models. Nice aesthetic. Sadly not open source. Have had some freezing issues on Linux.
* Jan.ai - Can't make requests to multiple models simultaneously
* LM Studio - Not open source. Doesn't support remote/cloud models (maybe there's a plugin?)
* GPT4All - Was getting weird JSON errors with openrouter models. Have to explicitly switch between models, even if you're trying to use them from different chats.
Still to try: Librechat, Open WebUI, AnythingLLM, koboldcpp.
I've been in the same quest for a while. Here's my list, not a recommendation or endorsement list, just a list of alternative clients I've considered, tried or am still evaluating:
- chatbox - https://github.com/chatboxai/chatbox - free and OSS, with a paid tier, supports MCP and local/remote, has a local KB, works well so far and looks promising.
- macai - https://github.com/Renset/macai simple client for remote APIs, does not support image pasting or MCP or anything really, very limited, crashes.
- typingmind.com - web, with a downloadable (if paid) version. Not OSS, but one-time payment, indie dev. One of the first alt chat clients I've ever tried, not using it anymore. Somewhat clunky gui, but ok. Supports MCP, haven't tried it it.
- Open WebUI - deployed for our team so that we could chat through many APIs. Works well for a multi-user web-deployment, but image generation hasn't been working. I don't like it as a personal client though, buggy sometimes but gets frequent fixes fortunately.
- jan.ai - it comes with popular models pre-populated listed, which makes it harder to plug into custom or local model servers. But it supports local model deployment within the app (like what ollama is announcing) which is good for people who don't want to deal with starting a server. I haven't played with it enough, but I personally prefer to deploy a local server (ie ollama, litellm...) and then just have the chat gui app give me a flexible endpoint configuration for adding custom models to it.
I'm also wary of evil actors deploying chat GUIs just to farm your API keys. You should be too. Use disposable api keys, watch usage, refresh with new keys once in a while after trying clients.
do you have any screenshots? the home page shows a picture of a tamagotchi but none of the actual chat interface, which makes me wonder if I’m outside of the target audience
Last I tried OpenWebUI (A few months ago), it was pretty painful to connect non-OpenAI externally hosted models. There was a workaround that involved installing a 3rd party "function" (or was it a "pipeline"?), but it didn't feel smooth.
Is this easier now? Specifically, I would like to easily connect anthropic models just by plugging in my API key.
CherryStudio is a power tool for this case https://github.com/CherryHQ/cherry-studio -- has MCP, search, personas, and reasoning support too. i use it heavily with llama.cpp + llama-swap
I've been using AnythingLLM for a couple months now and really like it. You can organize different "Workspaces" which are models + specific prompts and it supports Ollama along with the major LLM providers.
I have it running in a docker container on a raspberry pi and then I use Tailscale to make it accessible anywhere. It looks good on mobile too so it's pretty seamless.
I use that and Raycast's Claude extension for random questions and that's pretty much does everything I want.
I like webUI but it’s weird and conplicated how you have to set up the different models (via text files in the browser, the instructions contains a lot of confusing terms). Librechat is nice but I can’t get it to not log me out every 5 min which makes it unusable. I’ve been told it keeps you logged in when using https but I use tailscale so that is difficult (when doing multiple services on a single host).
Build your own! It's a great way to learn, keeps you interested in the latest developments. Plus you get to try out cool UX experiments and see what works. I built my own interface back in 2023 and have been slowly adding to it since. I added local models via MLX last month. I'm surprised more devs aren't rolling their own interface, they are easy to make and you learn a lot.
Open WebUI is definitely what you want. Supports any OpenAI-compatible provider, lets you manually configure your model list and settings for each model in a very user-friendly way, switching between models is instant, and it lets you send the same prompt to multiple models simultaneously in the same chat and displays them side by side.
gptel in emacs does this. You can run the same prompt against different models in separate emacs windows (local or via api w/ keys) at the same time to compare outputs. I highly recommended it. https://github.com/karthink/gptel
Our team has been using openwebui as the interface for our stack of open source models we run internally at work and it’s been fantastic! It has a great feature set, good support for MCPs, and is easy to stand up and maintain.
If you’re a power user of these LLMs and have coding experience, I actually recommend just whipping together your own bespoke chat UI that you can customize however you like. Grab any OpenAI compatible endpoint for inference and a frontend component framework (many of which have added standard Chat components) - the rest is almost trivial. I threw one together in a week with Gemini’s assistance and now I use it every day. Is it production ready? Hell no but it works exactly how I want it to and whenever I find myself saying “I wish it could do XYZ…” I just add it.
Kinda odd to be so dismissive of this mindset given this websites title. Whipping up your own chatui really is not that hard and is a pretty fun exercise. Knowing how your tools work and being able to tweak them to your specific usecases kinda rules!
I only did it once some 15 years back (in a happy memory) using LFS. It took about a week to get to a functional system with basic necessities. A code finetuned model can write a functional chat UI with all common features and a decent UX in under a minute.
I have been exploring AI and LLMs. I built my own AI chat bot using Python [1], and then [2] AI SDK from Vercel and OpenAI compatible API endpoints. And eventually build a product around it.
this is not coder
this help typing instructions. Coding is different. For example look at my repository and tell me how refactorizing it, write a new function etc.
In my opinion You must change name.
Yeah, I have one which lets me read a pdf and chat side by side, one which is integrated into my rss feed, one with insanely aggressive memory features (experimental) etc etc :)
> I don't know if parenting hits the "developer-tinkerer class" harder than others, but damn.
I sort of suspect so? Devs of parenting age trend towards being neurospicy, and dev work requires sustained attention with huge penalties for interruptions.
Not surprising; Ollama is set on becoming the standard interface for companies to deploy "open" models. The focus on "local" is incidental, and likely not long term. I'm sure Ollama is going to announce a plan to use "open" models through their own cloud-based API using this app.
Strongly disagree with this. It is the default go-to for companies that cannot use cloud-based services for IP or regulatory reasons (think of defense contractors). Isn't that the main reason to use "open" models, which are still weaker than closed ones?
> Ollama is set on becoming the standard interface for companies to deploy "open" models.
That's not what I've been seeing, but obviously my perspective (as anyone's) is limited. What I'm seeing is deployments of vLLM, SGLang, llama.cpp or even HuggingFace's Transformers with their own wrapper, at least for inference with open weight models. Somehow, the only place where I come across recommendations for running Ollama was on HN and before on r/LocalLlama but not even there as of late. The people who used to run Ollama for local inference (+ OpenWebUI) now seem to mostly be running LM Studio, myself included too.
> For pure CLI versions of Ollama, standalone downloads are available on Ollama’s GitHub releases page.
Nothing against that, just an observation.
Previously I tested several local LLM apps, and the 2 best ones to me were LM Studio [1] and Msty [2]. Will check this one out for sure.
One missing feature that the ChatGPT desktop app has and I think is a good idea for these local LLM apps is a shortcut to open a new chat anytime (Alt + Space), with a reduced UI. It is great for quick questions.
[1] https://lmstudio.ai/
[2] https://msty.app/
In fact, there are many self-made prototypes before this from different individuals. We were hooked, so we built it for ourselves.
Ollama is made for developers, and our focus in continually improving Ollama's capabilities.
It was nice it started downloading it but also there was no indication I don't have that model before hand until I opened drop-down to see download buttons.
But of course nice job guys.
I really like using ollama as a backend to OpenWebUI.
I don't have any windows machines and I don't work primarily on macos, but I understand that's where all the paying developers are, in theory.
Did y'all consider a partnership with one of the existing UI and bundle that, similar to duckdb approach?
thank you guys for all your work on it, regardless
PS totally running windows here and using kesor/ollama-proxy if I need to make it externally available.
Lots of people trying to being, and many with Ollama, and helping to create beginners is never a bad thing with tech.
As a developer feature request, it would be great if ollama could support more than one location at once, so that it is possible to keep a couple models 'live' but have the option to plug in an external disk with extra models being picked up auto-magically based on the ollama_models path please. Or maybe the server could present a simple html interface next to the API endpoint?
And just to say thanks for making these models easily accessible. I am agAInst AI generally, but it is nice to be able to have a play with these models locally. I havent found one that covers Zig, but appreciate the steady stream of new models to try. Thanks.
Lots of people trying to being, and many with Ollama, and helping to create beginners is never a bad thing with tech.
Many things can be for both developers and end-users. Developers can use the API directly, end users, have more choices.
Would be even better if there was a installation template that checks if Ollama is installed and if not download it as sub installation first checking user computer specs if enough RAM and fast enough CPU/GPU. Also API to prompt user (ask for permission) to install specific model if haven't been installed.
That's actually what we've done for our own App [1]. It checks if Ollama and other dependencies are installed. No model is bundled with it. We prompt user to install a model (you pick a model, click a button and we download the model; similar if you wish to remove a model). The aim is to make it quite simple for non-technical folks to use.
1) https://ai.nocommandline.com/
What's wrong with the name? Are you referring to the GPT trademark? That was rejected.
This is exactly what I've implemented for my Qt C++ app: https://www.get-vox.com
How would you compare and contrast between the two? My main use would be to use it as a tool with a chat interface rather than developing applications that talk to models.
I've used Msty but it seems like LM studio is moving faster, which is kind of important in this space. For example Msty still doesn't support MCP
I also tried LM Studio a few months back. The interface felt overly complex and I got weird error messages which made it look like I'd have to manually fix errors in the underlying python environment. Would have been fine if it was for work, but I just wanted to play around with LLMs in my spare time so I couldn't be bothered.
Deleted Comment
Deleted Comment
Deleted Comment
Also is there a link to the source?
> For pure CLI versions of Ollama, standalone downloads are available on Ollama’s GitHub releases page.
Sound like closed source. Plus, As I check, the app seem to be tauri app, as it use system webview instead of chromium.
Deleted Comment
https://ollama.com/download
this app got gui.
If you want better, (in some cases more open) alternatives:
> Vendor lock-in
That is, probably the most ridiculous of the statements. Ollama is open source, llama.cpp is open source, llamafiles are zip files that contain quantized versions of models openly available to be run with numerous other providers. Their llama.cpp changes are primarily for performance and compatibility. Yes, they run a registry on ollama.com for pre-packed, pre-quantized versions of models that are, again, openly available.
> Closed-source tweaks
Oh so many things wrong in a short sentence. Llama.cpp is MIT licensed, not GPL license. A proprietary fork is perfectly legitimate use. Also.. “proprietary“? The source code is literally available, including the patches, on GitHub in ollama/ollama project, in the “llama” folder with a patch file as recent as yesterday?
> Mixed Performance
Yes, almost anything suffers degraded performance when the goal is usability instead of performance. It is why people use C# instead of Assembly or punch cards. Performance isn’t the only metric, which makes this a useless point.
> Opaque model name
Sure, their official models have some ambiguities sometimes. I don’t know know that is the “problem” that people make it out to be when ollama is designed for average people to run models, and so a decision like “ollama run qwen3” not being the absolutely maximum best option possible rather than the option most people can run makes sense. Do really think it is advantageous or user friendly, when Tommy wants to try out “Deepseek-r1” on his potato laptop that a 671b parameter model too large to fit on almost anything consumer computer is the right choice and that it is instead meant as a “deception”? That seems…disingenuous. Not to mention, they are clearly listed as such on ollama.com, where in black and white it says the deep seek-r1 by default refers with the qwen model, and that the full model is available as deep seek-r1:671b
> Context Window
Probably the only fair and legitimate criticism of your entire comment.
I’m not an ollama defender or champion, couldn’t care about the company, and I barely use ollama (mostly just to run qwen3-8b for embedding). It really is just that most of these complaints you’re sharing from others seem to have TikTok-level fact checking.
- I like the simplicity. This would be perfect for setting up a non-technical friend or family member with a local LLM with just a couple clicks
- Multimodal and Markdown support works as expected
- The model dropdown shows both your local models and other popular models available in the registry
I could see using this over Open WebUI for basic use cases where one doesn't need to dial in the prompt or advanced parameters. Maybe those will be exposed later. But for now - I feel the simplicity is a strength.
- I still appreciate the simplicity to the point where I use it more the Open WebUI - no logins, no settings, just chat
- I wish the model select in the chat box was either moved or was more subtle, currently it visually draws interest to something that doesn't change much
- Chat summaries sometimes overflow in the chat history area
- Small nit but the window uses the default app icon on Windows rather than the Ollama icon
Another commenter mentioned not being able to point the new UI to a remote Ollama instance - I agree, that would be super handy for running the UI on a slow machine but inferring on something more powerful.
Most import feature for me is that I want to be able to chat with local models, remote models on my other machines, and cloud models (OpenAI API compatible). Anything that makes it easier to switch between models or query them simultaneously is important.
Here's what I've learned so far:
* Msty - my current favorite. Can do true simultaneous requests to multiple models. Nice aesthetic. Sadly not open source. Have had some freezing issues on Linux.
* Jan.ai - Can't make requests to multiple models simultaneously
* LM Studio - Not open source. Doesn't support remote/cloud models (maybe there's a plugin?)
* GPT4All - Was getting weird JSON errors with openrouter models. Have to explicitly switch between models, even if you're trying to use them from different chats.
Still to try: Librechat, Open WebUI, AnythingLLM, koboldcpp.
Would love to hear any other suggestions.
- chatbox - https://github.com/chatboxai/chatbox - free and OSS, with a paid tier, supports MCP and local/remote, has a local KB, works well so far and looks promising.
- macai - https://github.com/Renset/macai simple client for remote APIs, does not support image pasting or MCP or anything really, very limited, crashes.
- typingmind.com - web, with a downloadable (if paid) version. Not OSS, but one-time payment, indie dev. One of the first alt chat clients I've ever tried, not using it anymore. Somewhat clunky gui, but ok. Supports MCP, haven't tried it it.
- Open WebUI - deployed for our team so that we could chat through many APIs. Works well for a multi-user web-deployment, but image generation hasn't been working. I don't like it as a personal client though, buggy sometimes but gets frequent fixes fortunately.
- jan.ai - it comes with popular models pre-populated listed, which makes it harder to plug into custom or local model servers. But it supports local model deployment within the app (like what ollama is announcing) which is good for people who don't want to deal with starting a server. I haven't played with it enough, but I personally prefer to deploy a local server (ie ollama, litellm...) and then just have the chat gui app give me a flexible endpoint configuration for adding custom models to it.
I'm also wary of evil actors deploying chat GUIs just to farm your API keys. You should be too. Use disposable api keys, watch usage, refresh with new keys once in a while after trying clients.
Works fully local, privacy first, and it's a native app (Swift for macOS, WPF for Windows)
Is this easier now? Specifically, I would like to easily connect anthropic models just by plugging in my API key.
It feels a bit less polished but has more functions that run locally and things work better out of the box.
My favorite thing is that I can just type my own questions / requests in markdown so I can get formatting and syntax highlighting.
Electron. Python backend. Can talk to Ollama and other backends.
Need help with design and packaging.
https://github.com/adsharma/ask-me-anything
I can create workflows that use multiple models to achieve different goals.
Just download some tool and be productive within seconds, I'd say.
1. VT.ai https://github.com/vinhnx/VT.ai Python
2. VT Chat https://vtchat.io.vn: my own product
With a bit of help from ChatGPT etc., it was trivial to make, and I use it everyday now. I may add DDG and github search to it soon too.
Either directly or use it as a base for your own bespoke experience.
Please avoid internet tropes on HN.
https://news.ycombinator.com/newsguidelines.html
I sort of suspect so? Devs of parenting age trend towards being neurospicy, and dev work requires sustained attention with huge penalties for interruptions.
Yeah, my wife would murder me as our kids yelled at me for various things
Strongly disagree with this. It is the default go-to for companies that cannot use cloud-based services for IP or regulatory reasons (think of defense contractors). Isn't that the main reason to use "open" models, which are still weaker than closed ones?
Any whiff of a cloud service and the lawyers will freak out.
That's why we run models via Ollama on our laptops (M-series is crazy powerful) and a few servers on the intranet for more oomph.
LM Studio changed their license to allow commercial use without "call me" pricing, so we might look into that more too.
That's not what I've been seeing, but obviously my perspective (as anyone's) is limited. What I'm seeing is deployments of vLLM, SGLang, llama.cpp or even HuggingFace's Transformers with their own wrapper, at least for inference with open weight models. Somehow, the only place where I come across recommendations for running Ollama was on HN and before on r/LocalLlama but not even there as of late. The people who used to run Ollama for local inference (+ OpenWebUI) now seem to mostly be running LM Studio, myself included too.