ollama run hf.co/unsloth/Magistral-Small-2506-GGUF:UD-Q4_K_XL
or
./llama.cpp/llama-cli -hf unsloth/Magistral-Small-2506-GGUF:UD-Q4_K_XL --jinja --temp 0.7 --top-k -1 --top-p 0.95 -ngl 99
Please use --jinja for llama.cpp, and set temperature = 0.7 and top-p = 0.95!
Also, it's best to increase Ollama's context length to at least 8K: OLLAMA_CONTEXT_LENGTH=8192 ollama serve &. Some other details are in https://docs.unsloth.ai/basics/magistral
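If you don't want to pass that environment variable every time, the same settings can be baked into an Ollama Modelfile. A minimal sketch, assuming the model has already been pulled under the hf.co tag above (the new model name is made up for illustration):
# Modelfile -- FROM reuses the tag pulled by the ollama run command above
FROM hf.co/unsloth/Magistral-Small-2506-GGUF:UD-Q4_K_XL
PARAMETER temperature 0.7
PARAMETER top_p 0.95
PARAMETER num_ctx 8192
Then create and run it:
ollama create magistral-small-tuned -f Modelfile
ollama run magistral-small-tuned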
I would love to open a workspace. Full stop.
However, due to the price of the shredder and the tools required to transform the plastic into new forms, one needs a dedicated space with a lot of power. Then you need to secure a source of plastic. You would think this part would be easy; I mean, that is the whole premise of this org's existence, right? You would be wrong in that assumption. There is big money in "recycling" in the US. From the collection, sorting, and distribution of recycled materials... someone already has a contract to legally "do it."
I am bummed to see them in this position. There seem to be a few hotspots around the world where this would really work. They aren't near me, that is for sure.
I meant when you download a GGUF file from Hugging Face, instead of using a model from Ollama's library.
https://github.com/ollama/ollama/blob/main/docs%2Fmodelfile....
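For the downloaded-GGUF case, the Modelfile just points FROM at the local file. A minimal sketch (the path and model name here are hypothetical):
# Modelfile for a hand-downloaded GGUF; replace the path with wherever you saved it
FROM /mnt/nvme/models/some-model-Q4_K_M.gguf
ollama create some-model-local -f Modelfile
ollama run some-model-local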
Here is my workflow when using Open WebUI:
1. ollama show qwen3:30b-a3b-q8_0 --modelfile
2. Paste the contents of the modelfile into Open WebUI -> Admin -> Models, and rename it qwen3:30b-a3b-q8_0-monkversion-1
3. Change parameters, e.g. num_gpu 90 to change how many layers are offloaded to the GPU... etc.
4. Keep or delete the old model
Pay attention to the modelfile: it will show you something like "# To build a new Modelfile based on this, replace FROM with: # FROM qwen3:30b-a3b-q8_0", and you need to make sure the paths are correct. As an example of why that matters, I store my models on a large NVMe drive that isn't Ollama's default location.
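A rough CLI equivalent of steps 1-4, using the same model names as above (the num_gpu value is just an example):
ollama show qwen3:30b-a3b-q8_0 --modelfile > Modelfile
# edit Modelfile: keep "FROM qwen3:30b-a3b-q8_0" and add/adjust "PARAMETER num_gpu 90"
ollama create qwen3:30b-a3b-q8_0-monkversion-1 -f Modelfile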
EDIT TO ADD: The 'modelfile' workflow is a pain in the booty. It's a dogwater pattern and I hate it. Some of these models are 30 to 60GB and copying the entire thing to change one parameter is just dumb.
However, ollama does a lot of things right, and it makes it easy to get up and running. vLLM, SGLang, mistral.rs, and even llama.cpp require a lot more work to set up.
A wallet that only lasts 10 years seems disposable at this point.
My teens use these little things that attach to their phones to hold gym key, debit card and ID.
I use a traditional "wallet" or billfold as my abuelo used to call them, but I am positively a dinosaur using one. Also, the darn thing hurts my back if I leave it in my back pocket while driving/sitting.
Heck, I have been eyeing those crossbody bags, or sacoches, to hold the things that are in my wallet.
There are a bazillion and one hardware combinations where even RAM timings can make a difference. Offloading a small portion to a GPU can make a HUGE difference. Some engines have been optimized to run on Pascal with CUDA compute capability below 7.0, and some have tricks for newer-generation cards with modern CUDA. Some engines only run on Linux while others are completely cross-platform. It is truly the wild west of combinatorics as they relate to hardware and software. It is bewildering, to say the least.
In other words, there is no clear "best" outside of a DGX and Linux software stack. The only way to know anything right now is to test and optimize for what you want to accomplish by running a local LLM.
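As one concrete version of that test-and-optimize loop, llama.cpp's llama-bench can sweep GPU offload counts in a single run and report throughput for each. A sketch, with a hypothetical model path:
# compare throughput at 0, 20, 40, and 99 layers offloaded to the GPU
./llama.cpp/llama-bench -m /path/to/model.gguf -ngl 0,20,40,99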
As a learning project, copy a successful project: mind map it, program it, and find customers. You will learn so much that you can apply to whatever project you come up with. You'll get frustrated because you will need to go out and learn a lot, but power through it. You'll be better off for the hard work. Good luck!