Readit News
snorkel commented on Is-even-ai – Check if a number is even using the power of AI   npmjs.com/package/is-even... · Posted by u/modinfo
snorkel · 4 months ago
Lacks an isVeryEven() method, otherwise looks feature complete.
snorkel commented on Teaching a new way to prevent outages at Google   sre.google/stpa/teaching/... · Posted by u/motxilo
snorkel · 6 months ago
In other words, STPA is a design-review framework for finding some of the less obvious failure modes. FMEA is more popular, but it relies on listing all of the knowable failure modes in a system, so the failure modes you haven't thought of never make it onto the list. STPA helps fill in some of those gaps.
snorkel commented on Notebooks Are McDonalds of Code   yobibyte.github.io/notebo... · Posted by u/sebg
snorkel · a year ago
A notebook is a REPL with an inline wiki. Of course it is not intended for running production code; it's just an R&D environment to document, share, and test ideas.
snorkel commented on Scientists Reveal Why You Should Never Take Pebbles from the Beach   sciencealert.com/scientis... · Posted by u/laktak
underlogic · a year ago
Feeble-minded indeed. What are they criminalizing beach pebbles for? Lots of that in the UK. I wonder if the cause is cultural, genetic, or environmental. They just don't seem to be playing with a full deck over there. One non sequitur after another. I would suggest widespread brain damage from covid, but this has been going on for decades. There's something seriously wrong in the common reasoning and logic.
snorkel · a year ago
The article mentions people were raiding beaches for free building materials like sand and stone for concrete, which would be a problem at large volumes. Enforcing the same rules on individual souvenir collectors seems excessive.
snorkel commented on Steve Albini has died   pitchfork.com/news/steve-... · Posted by u/coloneltcb
snorkel · a year ago
Way too soon! He recently did an interview with Dave and Krist from Nirvana about the 30th anniversary of In Utero. They described recording prank phone calls during the sessions.
snorkel commented on Better and Faster Large Language Models via Multi-Token Prediction   arxiv.org/abs/2404.19737... · Posted by u/jasondavies
deskamess · a year ago
Side track: There is so much going on in this space. I wish there were a chronological flow of a machine learning scenario/story with all the terms being introduced as we meet them (data, pre-training, training, inference, mixture of experts, RAG). Like someone walking me through a factory explaining what happens at each stage (like Mr. Rogers used to do). Most of the time I do not know where the terms fit in the big picture. When I first came across pre-training I thought it was something done to the data before training happened, but it was actually another round of training.
snorkel · a year ago
Strongly recommend watching Andrej Karpathy's "Let's build GPT" videos on YouTube, which dive into an actual PyTorch implementation; then download the code and study it carefully. Then study "Spreadsheets Are All You Need" to see what the internal data structures look like.
snorkel commented on Memary: Open-Source Longterm Memory for Autonomous Agents   github.com/kingjulio8238/... · Posted by u/james_chu
CuriouslyC · a year ago
While I'm 100% on board with RAG using associative memory, I'm not sure you need Neo4j. Associative recall is generally going to be one level deep, and you're doing a top-K cut, so even if it weren't, the second-order associations are probably not going to make the relevance cut. This could be done relationally, and then if you're using pg_vector you could retrieve all your RAG contents in one query.
snorkel · a year ago
LLMs have a limited context size, i.e. the chatbot can only recall so much of the conversation. This project builds a knowledge graph of the entire conversation(s), then uses that knowledge graph as a RAG database.
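A minimal sketch of that idea (not memary's actual API; the class and method names here are hypothetical): extract (subject, relation, object) triples from chat turns, store them in a graph, and inject an entity's neighborhood back into the prompt as retrieved context.

```python
from collections import defaultdict

class ConversationGraph:
    """Toy knowledge graph built from conversation triples."""

    def __init__(self):
        # adjacency list: subject -> list of (relation, object) edges
        self.edges = defaultdict(list)

    def add_triple(self, subject, relation, obj):
        # In a real system the triples would be extracted by an LLM pass.
        self.edges[subject].append((relation, obj))

    def context_for(self, entity):
        """Return stored facts about an entity as text for the prompt."""
        return [f"{entity} {rel} {obj}" for rel, obj in self.edges[entity]]

graph = ConversationGraph()
graph.add_triple("Alice", "works at", "Acme")
graph.add_triple("Alice", "prefers", "Python")
# Later, when "Alice" comes up again, inject her facts as RAG context:
print(graph.context_for("Alice"))
# → ['Alice works at Acme', 'Alice prefers Python']
```

The point of the graph over a flat vector store is that facts about the same entity stay linked even when they were mentioned far apart in the conversation.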
snorkel commented on 1.18k drawings of plant root systems   images.wur.nl/digital/col... · Posted by u/bookofjoe
duckmysick · a year ago
Absolutely fascinating stuff, love it!

If you're doing any kind of gardening, you can find the root system of many common plants and weeds. Some examples:

- bindweed: https://images.wur.nl/digital/collection/coll13/id/193/rec/1 the white meaty rhizomes run at about 10 cm deep, but the actual roots can go much, much deeper (2.2 meters in this example)

- horsetail: https://images.wur.nl/digital/collection/coll13/id/753/rec/1 also grows deep roots; extremely sturdy - resists many herbicides and can spread through spores.

- goutweed: https://images.wur.nl/digital/collection/coll13/id/1435/rec/... dense network of thin roots

- dandelion: https://images.wur.nl/digital/collection/coll13/id/676/rec/2 this example has roots reaching 4.5 meters!

- potato: https://images.wur.nl/digital/collection/coll13/id/1014/rec/... was looking for a tomato plant but found this instead (they are the same genus); you can see the tubers too.

- carrot: https://images.wur.nl/digital/collection/coll13/id/1049/rec/... the edible taproot is not the only part

snorkel · a year ago
The dandelion root is 450 cm?! That explains why pulling up the sprouts does nothing to prevent it from sprouting again.
snorkel commented on Show HN: Beyond text splitting – improved file parsing for LLMs   github.com/Filimoa/open-p... · Posted by u/serjester
zby · a year ago
What I want is dynamic chunking - I want to search a document for a word, and then I want to get the largest chunk that fits into my limits and contains the found word. Has anyone worked on such a thing?
snorkel · a year ago
OpenSearch, perhaps? The search query returns a list of hits (matches), each with a text_entry field containing the matching excerpt from the source doc.
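For the dynamic-chunking part of the question, a minimal sketch (a hypothetical helper, not an OpenSearch feature): find the word, then grow the chunk around the match up to the size limit.

```python
def chunk_around(text, word, max_len):
    """Return the largest substring of `text`, at most `max_len` characters,
    centered on the first occurrence of `word`. Returns None if the word
    is absent. Callers should pass max_len >= len(word)."""
    i = text.find(word)
    if i == -1:
        return None
    # Grow symmetrically around the match, clamped to the text bounds.
    center = i + len(word) // 2
    start = max(0, center - max_len // 2)
    end = min(len(text), start + max_len)
    start = max(0, end - max_len)  # re-clamp if we hit the right edge
    return text[start:end]
```

A real version would snap the chunk boundaries to sentence or token boundaries rather than raw character offsets, but the grow-around-the-hit logic is the same.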
snorkel commented on How Chain-of-Thought Reasoning Helps Neural Networks Compute   quantamagazine.org/how-ch... · Posted by u/amichail
stavros · a year ago
I think that chain-of-thought for LLMs is just helping them enhance their "memory", as it puts their reasoning into the context and helps them refer to it more readily. That's just a guess, though.
snorkel · a year ago
That’s pretty much correct. An LLM is often used rather like a forecast model that can forecast the next word in a sequence of words. When it’s generating output it’s just continuously forecasting (predicting) the next word of output. Your prompt is just providing the model with input data to start forecasting from. The prior output itself also becomes part of the context to forecast from. The output of “think about it step-by-step” becomes part of its own context to continue forecasting from, and hence guides its output. I know that “forecasting” is technically not the right term, but I’ve found it helpful for understanding what LLMs are actually doing when generating output.
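To make that feedback loop concrete, here is a toy version of it (the "model" is just a bigram lookup table standing in for a real LLM, purely for illustration): each predicted token is appended to the context and becomes input for the next prediction.

```python
def generate(model, context, n_tokens):
    """Autoregressive loop: predict the next token, append it, repeat."""
    tokens = list(context)
    for _ in range(n_tokens):
        next_token = model.get(tokens[-1])  # "forecast" from current context
        if next_token is None:
            break
        tokens.append(next_token)  # the output feeds back into the context
    return tokens

# Toy "model": a bigram table mapping each token to its most likely successor.
bigram = {"think": "step", "step": "by", "by": "step"}
print(generate(bigram, ["think"], 3))
# → ['think', 'step', 'by', 'step']
```

This is why chain-of-thought helps: the intermediate "reasoning" tokens land in the context window, so every later prediction is conditioned on them.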

u/snorkel · karma 3186 · joined May 21, 2007