Readit News
cheeko1234 commented on DeepScaleR: Surpassing O1-Preview with a 1.5B Model by Scaling RL   pretty-radio-b75.notion.s... · Posted by u/sijuntan
simonw · a year ago
Does anyone have any good simple prompts for testing new "reasoning" models like this one?

"Count the letter Rs in the word strawberry" is a bit dull!

I'm trying this one locally using Ollama and the smallest quantized GGUF version (769MB) I could find - https://huggingface.co/bartowski/agentica-org_DeepScaleR-1.5... - I ran it like this:

  ollama run hf.co/bartowski/agentica-org_DeepScaleR-1.5B-Preview-GGUF:IQ3_XXS

Here's the answer it gave me: https://gist.github.com/simonw/5943a77f35d1d5185f045fb53898a... - it got the correct answer after double-checking itself 9 times! And if you look at its thought process, it made a pretty critical error right at the start:

  "Strawberry" has the letters S, T, R, A, W, B, E, R, F,
  U, R, E. Let me count each 'R' as I go along.

Hopefully the versions that aren't quantized that tiny do better than that.
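
As a quick sanity check on the quoted reasoning, here is a minimal Python sketch that compares the letters the model listed with the real spelling. The variable names are made up for illustration; the letter list is copied from the quote above.

```python
# Letters exactly as the model listed them in the quoted reasoning.
model_letters = ["S", "T", "R", "A", "W", "B", "E", "R", "F", "U", "R", "E"]
word = "strawberry"

# Ground truth from the actual word.
actual_letters = list(word.upper())
actual_r_count = word.count("r")

# What the model's (mis)spelling would give.
model_r_count = model_letters.count("R")
spelling_matches = model_letters == actual_letters

print("actual letters:", actual_letters)
print("R count (actual vs. model's list):", actual_r_count, model_r_count)
print("model spelled the word correctly:", spelling_matches)
```

So the model's letter list is not the real spelling, yet it happens to contain the same number of Rs, which would explain a correct answer built on a wrong first step.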

cheeko1234 · a year ago
I have two. One is a simple one that only deepseek R1 has passed (in my opinion):

I have a 12 liter jug and a 6 liter jug. How do I get exactly 6 liters of water?

Answer (Deepseek): Fill the 6-liter jug completely to obtain exactly 6 liters of water.

Every other LLM I've tried, including o3-mini-high: Fill the 12-liter jug completely. Pour it into the 6 liter jug.

Although o3 did get it right in the reasoning: It seems like the user has a 12-liter jug and a 6-liter jug. The simplest answer is to just fill the 6-liter jug directly with water—done! But maybe there's a catch, like needing to use both jugs somehow.

So it knows that the 12 liter jug is mentioned uselessly, but most LLMs HAVE to use the 12 liter jug since it's mentioned in the prompt.
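
To make the "correct" answer concrete, here is a rough sketch of a brute-force check: a breadth-first search over jug states that returns the shortest sequence of moves. The function name `shortest_plan` and the move labels are my own, not from any library, and it assumes the usual puzzle rules (fill, empty, or pour one jug into another).

```python
from collections import deque


def shortest_plan(capacities, target):
    """Breadth-first search over jug states; returns the shortest list of moves."""
    start = tuple(0 for _ in capacities)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if target in state:  # some jug holds exactly the target amount
            return plan
        moves = []
        for i, cap in enumerate(capacities):
            # Fill jug i to the top.
            filled = list(state)
            filled[i] = cap
            moves.append((tuple(filled), f"fill {cap}L jug"))
            # Empty jug i.
            emptied = list(state)
            emptied[i] = 0
            moves.append((tuple(emptied), f"empty {cap}L jug"))
            # Pour jug i into every other jug j until i is empty or j is full.
            for j, cap_j in enumerate(capacities):
                if i != j:
                    amount = min(state[i], cap_j - state[j])
                    poured = list(state)
                    poured[i] -= amount
                    poured[j] += amount
                    moves.append((tuple(poured), f"pour {cap}L jug into {cap_j}L jug"))
        for nxt, label in moves:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [label]))
    return None  # target not reachable


plan = shortest_plan((12, 6), 6)
print(plan)
```

The shortest plan is a single move that never touches the 12 liter jug, which is the whole point of the prompt.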

Another prompt I recently found via Perplexity (DeepSeek got it correct on the first try, but other LLMs can only get it right if I add "Think step by step before answering."):

A 67-year-old woman with congenital bicuspid aortic valve is admitted to the hospital because of a 2-day history of fever and chills. Her current medication is lisinopril. Physical examination reveals a temperature of 38.5°C (101.3°F), blood pressure of 110/70 mmHg, and a heart rate of 100/min. Cardiac auscultation demonstrates a grade 3/6 systolic ejection murmur at the right upper sternal border. Blood cultures are obtained, and empiric antibiotic therapy is initiated. Which of the following is the most likely causative organism?

A) Staphylococcus aureus
B) Streptococcus viridans
C) Escherichia coli
D) Pseudomonas aeruginosa
E) Candida albicans

cheeko1234 commented on Google Chrome just rolled out a new way to track you and serve ads   theconversation.com/googl... · Posted by u/_cnhi
Aardwolf · 3 years ago
> Additionally, there are features such as Protected Audience that can serve you ads for “remarketing” (for example, Chrome tracked you visiting a listing for a toaster, so now you will get ads for toasters elsewhere), and Attribution Reporting, that gathers data on ad clicks.

Why is this actually effective? When you bought a toaster (which is a likely reason why you visited the listing), why would ads for other toasters still be relevant?

cheeko1234 · 3 years ago
Visiting a listing for a toaster means that I'm in the market for a toaster; it doesn't necessarily imply I already bought one.

Also, with this data, they can potentially build a better model than Amazon's recommender model. Most consumers have a pattern when buying and there's a lot of money in figuring it out.

cheeko1234 commented on Falcon 180B   huggingface.co/blog/falco... · Posted by u/osanseviero
cheeko1234 · 3 years ago
It failed my basic benchmark that only GPT4 passes:

Q: I have a 12 liter jug and a 6 liter jug. I need 6 liters of water. How do I do it?

Falcon180B: First, fill the 6-liter jug completely. Then, pour it into the 12-liter jug. Fill the 6-liter jug again, and pour it into the 12-liter jug until the 12-liter jug is full. At this point, there will be exactly 6 liters of water in the 6-liter jug, which you can then use.

Q: Using only a 6-liter jug, how can I get exactly 6 liters of water?

Falcon180B: You cannot get exactly 6 liters of water in a 6-liter jug because it would already be full. However, if you had another container that could hold at least 6 liters of water, you could follow these steps:

1. Fill the 6-liter jug with water.
2. Pour the water from the 6-liter jug into the other container.
3. Refill the 6-liter jug with water and pour it into the other container.
4. Repeat step 3 until you have poured 6 liters of water into the other container.

This way, you can get exactly 6 liters of water by combining the capacity of the 6-liter jug and the other container.
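
For what it's worth, tracing Falcon's first answer (the two-jug one) step by step shows it doesn't even end where it claims. A small sketch, assuming a literal reading of each step; the `pour` helper is hypothetical:

```python
def pour(source, dest, dest_capacity):
    """Pour from source into dest until source is empty or dest is full.

    Returns the new (source, dest) amounts.
    """
    amount = min(source, dest_capacity - dest)
    return source - amount, dest + amount


big, small = 0, 0                   # 12-liter and 6-liter jugs, both empty
small = 6                           # "fill the 6-liter jug completely"
small, big = pour(small, big, 12)   # "pour it into the 12-liter jug"
small = 6                           # "fill the 6-liter jug again"
small, big = pour(small, big, 12)   # "pour it into the 12-liter jug until the 12-liter jug is full"

print("12-liter jug:", big, "| 6-liter jug:", small)
```

Under that reading the 12-liter jug ends up full and the 6-liter jug ends up empty, not holding "exactly 6 liters" as the answer says.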

cheeko1234 commented on Factorio: Space Age   factorio.com/blog/post/ff... · Posted by u/haunter
cheeko1234 · 3 years ago
This day got a lot better! I have thousands of hours but had taken a break after I reached a rocket launch every minute with 100s of trains running around.

I guess this break will be over soon. I look forward to dropping a few more lifetimes playing this amazing game.

cheeko1234 commented on Jupyter AI   jupyter-ai.readthedocs.io... · Posted by u/jonbaer
phillipcarter · 3 years ago
This is probably the more helpful page as it shows what you can do: https://jupyter-ai.readthedocs.io/en/latest/users/index.html...

This is a nice feature! Not huge, but it's great DevEx (MLEngEx...?)

cheeko1234 · 3 years ago
Thanks for the link, although for some reason I keep getting an "Incorrect API key provided" error on Jupyter for ChatGPT. I'm on a paid account, so I'm not sure why...

cheeko1234 commented on Experiencing decreased performance with ChatGPT-4   community.openai.com/t/ex... · Posted by u/SmartVA
cheeko1234 · 3 years ago
I use the following test to ensure I'm on GPT4 and not 3.5. (I noticed that it did fail at this test temporarily and then got it. Not sure why. Maybe it reverts back to 3.5 when under load?)

I have a 12 liter jug and a 6 liter jug. I want to measure 6 liters. How do I do it?

GPT4: You actually don't need to do anything because one of your jugs is already a 6-liter jug. If you fill it up to the top, you'll have exactly 6 liters of water.

GPT-3.5: To measure exactly 6 liters using a 12-liter jug and a 6-liter jug, you can follow the steps below:

Start with both jugs empty.
Fill the 12-liter jug completely with water.
Pour the water from the 12-liter jug into the 6-liter jug. This will leave you with 6 liters of water in the 12-liter jug.
Empty the 6-liter jug.
Pour the 6 liters of water from the 12-liter jug back into the empty 6-liter jug. Now, you have 6 liters of water in the 6-liter jug.
At this point, you have successfully measured 6 liters using the 12-liter jug and the 6-liter jug.

cheeko1234 commented on How to control your smart home without yelling at a dumb voice assistant   arstechnica.com/gadgets/2... · Posted by u/zdw
jon-wood · 3 years ago
It appears we're doing "smart home" again, so it's time for my regular plea to stop calling shitty light switches automation. Automation is when you don't have to think about what state devices are in, and while it takes some tinkering, it can be achieved using something like Home Assistant to bridge various different devices together.

Each room in my house has Hue bulbs in all the light sockets, and sensors which can detect ambient light levels and motion. If I walk into a room and it's dark, the lights turn on, because I probably want to be able to see what I'm doing. When they stop detecting motion they turn off.

There's some additional complexity in rooms where it's normal to occupy them for an extended period without moving about much. The bathroom has a button I can press which disables motion detection for an hour so I can have a bath without being plunged into darkness ten minutes after getting in. Some day I'll get round to replacing that with a leak sensor that picks up there being water in the bath.

The living room will ignore the motion sensor and assume it's occupied when the TV is playing media, or has its input set to a games console. The bedroom knows when I typically go to bed, and once it detects motion around that time it doesn't turn off the lights until I tell Siri I'm going to sleep, at which point the motion sensor turns off until my alarm goes off in the morning.

Finally, the Home Assistant app is tracking when I enter and leave a geofence around my home. When I leave, everything turns off. When I get back, it turns back on again.

The end result of all this is that I don't ever really think about lighting, and because it's all automated I can also use it for subconscious prompts about time of day - for example, the lighting in my work area shifts from a quite stark daylight tone to a much softer one at the end of the work day, while all the lighting dims itself as bedtime approaches.
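
The per-room rules described above boil down to a small decision function. Here is a minimal sketch in plain Python; it is not Home Assistant's API or the commenter's actual configuration, and the names (`RoomState`, `next_light_state`) and the lux threshold are assumptions for illustration only.

```python
from dataclasses import dataclass

DARK_THRESHOLD_LUX = 50.0  # assumed cutoff for "it's dark"; tune per room


@dataclass
class RoomState:
    motion_detected: bool
    ambient_lux: float
    occupancy_override: bool = False  # e.g. bath button pressed, TV playing media
    someone_home: bool = True         # geofence says somebody is in the house


def next_light_state(room: RoomState, lights_on: bool) -> bool:
    """Decide whether the room's lights should be on after this sensor update."""
    if not room.someone_home:
        return False  # left the geofence: everything turns off
    occupied = room.motion_detected or room.occupancy_override
    if not occupied:
        return False  # no motion and no override: turn off
    if lights_on:
        return True   # already on and still occupied: leave them on
    # Lights are off and the room is occupied: only turn on if it's dark.
    return room.ambient_lux < DARK_THRESHOLD_LUX


# Walking into a dark room turns the lights on.
print(next_light_state(RoomState(motion_detected=True, ambient_lux=5.0), lights_on=False))
# Sitting still in the bath with the override active keeps them on.
print(next_light_state(RoomState(False, 5.0, occupancy_override=True), lights_on=True))
```

Keeping the decision a pure function of sensor readings plus the current light state makes it easy to test without any hardware, whichever automation engine ends up calling it.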

cheeko1234 · 3 years ago
I have home assistant running with quite a few esp32's running esphome with random sensors/bulbs scattered around the house.

My question is, what do you use to manage the automations? Do you use the native Home Assistant automation, or Node-RED, etc.? I've even looked a bit into Room Assistant[0].

[0] https://www.room-assistant.io/guide/#how-it-works

cheeko1234 commented on Ask HN: What are some of the best documentaries you've seen?    · Posted by u/rasulkireev
asdfqwertzxcv · 4 years ago
Fires of Kuwait - https://www.youtube.com/watch?v=-tUzkF_Dp8M

A first-hand account of the firefighters who were sent to cap the oil wells that Saddam's forces set on fire as they retreated from Kuwait.

Senna - https://www.youtube.com/watch?v=-aPLOQeSHrM

The story of the famed F1 driver Ayrton Senna. I'm not a racing fan but this one was amazing.

Apollo 11 - https://www.youtube.com/watch?v=3Co8Z8BQgWc

Completely new footage was found of the complete Apollo 11 mission. The best doc of the space race, by far.

cheeko1234 · 4 years ago
I would also recommend Moon Machines[0] for an amazing in-depth experience of the Apollo program:

Part 1: The Saturn V Rocket

Part 2: The Command Module

Part 3: The Navigation Computer

Part 4: The Lunar Module

Part 5: The Space Suit

Part 6: The Lunar Rover

Available on vimeo[1] and youtube.

[0] https://en.wikipedia.org/wiki/Moon_Machines

[1] https://vimeo.com/673970849

u/cheeko1234

Karma: 407 · Cake day: February 5, 2013