pugio commented on Batch Mode in the Gemini API: Process More for Less   developers.googleblog.com... · Posted by u/xnx
pugio · a month ago
Hah, I've been wrestling with this ALL DAY. Another example of Phenomenal Cosmic Powers (AI) combined with itty bitty docs (typical of Google). The main endpoint ("https://generativelanguage.googleapis.com/v1beta/models/gemi...") doesn't even have actual REST documentation in the API. The Python API has 3 different versions of the same types. One of the main ones (`GenerateContentRequest`) isn't available in the newest path (`google.genai.types`) so you need to find it in an older version, but then you start getting version mismatch errors, and then pydantic errors, until you finally decide to just cross your fingers and submit raw JSON, only to get opaque API errors.

So, if anybody else is frustrated and not finding anything online about this, here are a few things I learned, specifically for structured output generation (which is a main use case for batching) - the individual request JSON should resolve to this:

```json
{
  "request": {
    "contents": [
      {
        "parts": [
          { "text": "Give me the main output please" }
        ]
      }
    ],
    "system_instruction": {
      "parts": [
        { "text": "You are a main output maker." }
      ]
    },
    "generation_config": {
      "response_mime_type": "application/json",
      "response_json_schema": {
        "type": "object",
        "properties": {
          "output1": { "type": "string" },
          "output2": { "type": "string" }
        },
        "required": [ "output1", "output2" ]
      }
    }
  },
  "metadata": { "key": "my_id" }
}
```

To get actual structured output, don't just set `generation_config.response_schema`: you need to include the MIME type (`response_mime_type`), and the schema key should be `response_json_schema`. Any other combination will either throw opaque errors or won't trigger Structured Output (and the response will contain the usual LLM intros, "I'm happy to do this for you...").
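A minimal sketch of building one such line in Python, using only the standard library. The key names mirror the JSON above; the helper names `make_batch_line` and `write_batch_file` and the output path are hypothetical, and nothing here calls the Gemini API itself.

```python
import json


def make_batch_line(key, prompt, system_text, schema):
    """Return one JSONL line for a structured-output batch request."""
    record = {
        "request": {
            "contents": [{"parts": [{"text": prompt}]}],
            "system_instruction": {"parts": [{"text": system_text}]},
            "generation_config": {
                # Both keys are needed together, per the notes above.
                "response_mime_type": "application/json",
                "response_json_schema": schema,
            },
        },
        "metadata": {"key": key},
    }
    # json.dumps escapes newlines inside strings, so the record stays on one line.
    return json.dumps(record)


def write_batch_file(path, lines):
    """Write one request per line, as the .jsonl upload expects."""
    with open(path, "w", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")


schema = {
    "type": "object",
    "properties": {"output1": {"type": "string"}, "output2": {"type": "string"}},
    "required": ["output1", "output2"],
}
line = make_batch_line(
    "my_id", "Give me the main output please", "You are a main output maker.", schema
)
```

Calling `write_batch_file("batch_requests.jsonl", [line])` would then produce a file with a single request in it.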

So you upload a .jsonl file with the above JSON, and then you try to submit it for a batch job. If something is wrong with your file, you'll get a "400" and no other info. If something is wrong with the request submission you'll get a 400 with "Invalid JSON payload received. Unknown name \"file_name\" at 'batch.input_config.requests': Cannot find field."

I got the above error endless times when trying their exact sample code:

```
BATCH_INPUT_FILE='files/123456' # File ID
curl https://generativelanguage.googleapis.com/v1beta/models/gemi... \
  -X POST \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type:application/json" \
  -d "{
    'batch': {
      'display_name': 'my-batch-requests',
      'input_config': {
        'requests': {
          'file_name': ${BATCH_INPUT_FILE}
        }
      }
    }
  }"
```

Finally got the job submission working via the Python API (`file_batch_job = client.batches.create()`), but remember, if something is wrong with the file you're submitting, they won't tell you what, or how.
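Since the API reportedly gives no detail about a bad file, a local pre-flight check may save some round trips. This is only a sketch: the rules it enforces are the ones described in this comment, not an official schema, and the function names `check_batch_line` and `check_batch_file` are made up for illustration.

```python
import json


def check_batch_line(line):
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict) or not isinstance(record.get("request"), dict):
        return ["missing top-level 'request' object"]

    problems = []
    request = record["request"]
    if not request.get("contents"):
        problems.append("'request.contents' is missing or empty")

    config = request.get("generation_config") or {}
    if not isinstance(config, dict):
        return problems + ["'generation_config' is not an object"]
    # Assumption taken from the comment above, not from official docs.
    if "response_schema" in config:
        problems.append("uses 'response_schema' instead of 'response_json_schema'")
    if "response_json_schema" in config and config.get("response_mime_type") != "application/json":
        problems.append("'response_json_schema' set without response_mime_type 'application/json'")
    return problems


def check_batch_file(path):
    """Return (line_number, problem) pairs for every non-blank line in a .jsonl file."""
    found = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if line.strip():
                found.extend((number, problem) for problem in check_batch_line(line))
    return found
```

An empty list from `check_batch_file` only means the file passed these few checks; the service may still reject it for other reasons.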

pugio commented on Peasant Railgun   knightsdigest.com/what-ex... · Posted by u/cainxinth
robocat · 2 months ago
https://www.dndbeyond.com/magic-items/4615-decanter-of-endle...

abridged: speak the command word "Geyser" produces 30 gallons of water that gushes forth in a geyser 30 feet long and 1 foot wide. As a bonus action while holding the decanter, you can aim the geyser at a creature you can see within 30 feet of you. The water stops pouring out at the start of your next turn.

Doesn't seem like the decanter has thrust. To design something that thrusts would require your character to have a deep understanding of D&D physics (or maybe just some deadly experimentation?!)

Trying to mix mundane physics and universe book rules (i.e. peasant railgun) in a D&D universe sounds like a dangerous pastime for a character.

Cue the old SF story where Hitler wants to control a portal to other planets. Version 1 destroyed when other end moved into a red sun creating a radioactive lance, Version 2 destroyed when other end put into an ocean creating high pressure hose.

Do you want your character to magically refine Plutonium, Madame Curie?

pugio · 2 months ago
I said "as a kid" - I am not so young as to have played D&D 5e as a child.

I was referring to 3e, when the "simulation" aspect of the game was more heavily emphasized. See: https://www.dandwiki.com/wiki/SRD:Decanter_of_Endless_Water

> “Geyser” produces a 20-foot-long, 1-foot-wide stream at 30 gallons per round. ... The geyser effect causes considerable back pressure, requiring the holder to make a DC 12 Strength check to avoid being knocked down.

It was that last line that initially sparked the idea. Given the stated effects, this didn't seem like so much of a physics+rules stretch. The no-friction freedom of movement may have been more beyond the pale. Unfortunately 5e deliberately tried to close all the fun ways one could abuse various items.

pugio commented on Peasant Railgun   knightsdigest.com/what-ex... · Posted by u/cainxinth
pugio · 2 months ago
When I was a kid I had a character that could fly. I realized that a Decanter of Endless Water put out a pretty powerful constant thrust. Then a Helmet of Freedom of Movement could be interpreted to remove all excess friction due to wind resistance (I forget the details but it was something about removing any factor that would inhibit your movement). Constant acceleration and no friction... Unlimited speed.

I actually sat down and worked out all the equations based on the mass of my character and the amount of thrust the decanter provided. Our party would be deep in the wilderness somewhere and I'd say "I nip back to town to pick up some supplies, with acceleration and deceleration it takes me 17 minutes".

Looking back, I think I was a pretty annoying player, but my DM was very patient. I guess he could see I put a lot of work into the scheme. It was also probably the most exciting application of physics I had encountered in my life so far.
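For the curious, the accelerate-then-decelerate arithmetic can be sketched roughly as below. It assumes constant thrust, no drag, a flip at the midpoint, and starting and ending at rest; the thrust, mass, and distance figures are hypothetical placeholders, not the numbers from the original character sheet.

```python
import math


def acceleration(thrust_n, mass_kg):
    """Newton's second law: a = F / m."""
    return thrust_n / mass_kg


def travel_time(distance_m, accel_m_s2):
    """Total time when accelerating for the first half and braking for the second.

    Each half covers d/2 = a * t_half**2 / 2, so t_half = sqrt(d / a)
    and the whole trip takes 2 * sqrt(d / a).
    """
    return 2.0 * math.sqrt(distance_m / accel_m_s2)


def peak_speed(distance_m, accel_m_s2):
    """Speed at the midpoint flip: v = a * t_half = sqrt(d * a)."""
    return math.sqrt(distance_m * accel_m_s2)


# Hypothetical inputs: 500 N of thrust, a 100 kg character, a 50 km trip.
a = acceleration(500.0, 100.0)
print(f"{travel_time(50_000.0, a):.0f} s, peaking at {peak_speed(50_000.0, a):.0f} m/s")
```

Travel time grows only with the square root of distance, which is why a quick trip back to town looks so cheap under these assumptions.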

pugio commented on Trying to teach in the age of the AI homework machine   solarshades.club/p/dispat... · Posted by u/notarobot123
johnea · 3 months ago
One of the most offensive words in the anthropomorphization of LLMs is: hallucinate.

It's not only an anthropomorphism, it's also a euphemism.

A correct interpretation of the word would imply that the LLM has some fantastical vision that it mistakes for reality. What utter bullsh1t.

Let's just use the correct word for this type of output: wrong.

When the LLM generates a sequence of words that may or may not be grammatically correct, but implies a state or conclusion that is not factually correct, let's state what actually happened: the LLM-generated text was WRONG.

It didn't take a trip down Alice's rabbit hole, it just put words together into a stream that inferred a piece of information that was incorrect, it was just WRONG.

The euphemistic aspect of using this word is a greater offense than the anthropomorphism, because it's painting some cutesy picture of what happened, instead of accurately acknowledging that the s/w generated an incorrect result. It's covering up for the inherent shortcomings of the tech.

pugio · 3 months ago
When a person hallucinates a dragon coming for them, they are wrong, but we still use a different word to more precisely indicate the class of error.

Not all LLM errors are hallucinations - if an LLM tells me that 3 + 5 is 7, it's just wrong. If it tells me that the source for 3 + 5 being 7 is a seminal paper entitled "On the relative accuracy of summing numbers to a region +-1 from the fourth prime", we would call that a hallucination. In modern parlance "hallucination" has become a term of art to represent a particular class of error that LLMs are prone to. (Others have argued that "confabulation" would be more accurate, but it hasn't really caught on.)

It's perfectly normal to repurpose terms and anthropomorphizations to represent aspects of the world or systems that we create. You're welcome to try to introduce other terms that don't include any anthropomorphization, but saying it's "just wrong" conveys less information and isn't as useful.

pugio commented on Show HN: Defuddle, an HTML-to-Markdown alternative to Readability   github.com/kepano/defuddl... · Posted by u/kepano
kepano · 3 months ago
No it's all rules-based. I think the code you're referring to is "extractors", which are website-specific rules that I'm working on to standardize the output from sites with comments threads (e.g. HN, Reddit) and conversational chats (ChatGPT, Claude, Gemini).
pugio · 3 months ago
I would love something which reliably extracted a markdown back/forth from all the main LLM providers. I tried `defuddle` on a shared Gemini URL and it returned nothing but the "Sign In" link. Maybe I'm using your extractor wrong? How are you managing to get the rendered conversation HTML?
pugio commented on How University Students Use Claude   anthropic.com/news/anthro... · Posted by u/pseudolus
pugio · 5 months ago
I've used AI for one of the best studying experiences I've had in a long time:

1. Dump the whole textbook into Gemini, along with various syllabi/learning goals.

2. (Carefully) Prompt it to create Anki flashcards to meet each goal.

3. Use Anki (duh).

4. Dump the day's flashcards into a ChatGPT session, turn on voice mode, and ask it to quiz me.

Then I can go about my day answering questions. The best part is that if I don't understand something, or am having a hard time retaining some information, I can immediately ask it to explain - I can start a whole side tangent conversation deepening my understanding of the knowledge unit in the card, and then go right back to quizzing on the next card when I'm ready.

It feels like a learning superpower.

pugio commented on The case against conversational interfaces   julian.digital/2025/03/27... · Posted by u/nnx
pugio · 5 months ago
> The second thing we need to figure out is how we can compress voice input to make it faster to transmit. What’s the voice equivalent of a thumbs-up or a keyboard shortcut? Can I prompt Claude faster with simple sounds and whistles?

This reminds me of the amazing 2013 video of Tavis Rudd coding Python by voice: https://youtu.be/8SkdfdXWYaI?si=AwBE_fk6Y88tLcos

The number of times in the last few years I've wanted that level of "verbal hotkeys"... The latencies of many coding LLMs are still a little bit too high to allow for my ideal level of flow (though admittedly I haven't tried ones hosted on services like Groq), but I can clearly envision a time when I'm issuing tight commands to a coder model that's chatting with me and watching my program evolve on screen in real time.

On a somewhat related note to conversational interfaces, the other day I wanted to study some first aid stuff - used Gemini to read the whole textbook and generate Anki flashcards, then copied and pasted the flashcards directly into ChatGPT voice mode and had it quiz me. That was probably the most miraculous experience of voice interface I've had in a long time - I could do chores while being constantly quizzed on what I wanted to learn, and anytime I had a question or comment I could just ask it to explain or expound on a term or tangent.

pugio commented on What to Do   paulgraham.com/do.html... · Posted by u/npalli
pugio · 5 months ago
(No implied critique of the actual essay) but when I saw that title from PG, I was really hoping it would address the 2025 question "What should one do now?"

At a time when it seems like so many pursuits or activities or things to make are overshadowed by "but won't there be a model in the next 6 months that can just do this itself?", not to mention all the other present world uncertainties...

Well, it would be nice to hear more thought as to how to focus one's energies.

(I have my own thoughts on this of course, but what I'm really advocating / hoping for is more strong takes on the question.)

pugio commented on The young, inexperienced engineers aiding DOGE   wired.com/story/elon-musk... · Posted by u/medler
fingerlocks · 7 months ago
Sounds like a DEXA scan would be much more appropriate. Less radiation, cheaper, faster, and specifically tailored for measuring body composition. It’s like 40 bucks and five minutes.

Getting an MRI for body composition is like using industrial high-precision equipment to measure the length of a hotdog.

pugio · 7 months ago
I'm a bit wary of regular DEXA due to the ionizing radiation. MRIs have essentially zero health side-effects if you're not using any contrast agents.

DEXA is definitely cheaper, but a good amount of my time spent in MRIs was due to assisting in various research and QA projects. Unless you're made of money, I wouldn't recommend that to anyone who has to pay. I wish they were cheaper...

pugio commented on The young, inexperienced engineers aiding DOGE   wired.com/story/elon-musk... · Posted by u/medler
slantedview · 7 months ago
Nobody gets an MRI for fun.
pugio · 7 months ago
I do. I think it's interesting to have scans of parts of my body – brain, body fat/muscle distribution, etc. I also use them as reference for how my body changes over the decades.

(EDIT: Nothing to do with Medicare or fraudulent billing. Just pushing back on the "for fun" point. I can fall asleep in those things.)

u/pugio

Karma: 1032 · Cake day: May 6, 2009
About
Focused on (STEM) education technology.

Contact: <HN username> + "x" @ gmail
