joshuakogut commented on How has DeepSeek improved the Transformer architecture?   epoch.ai/gradient-updates... · Posted by u/superasn
ilaksh · a year ago
Why is it that the larger models are better at understanding and following more and more complex instructions, and generally just smarter?

With DeepSeek we can now run it on non-GPU servers with a lot of RAM. But surely quite a lot of the 671 GB or whatever is knowledge that is usually irrelevant?

I guess what I'm sort of thinking of is something like a model that comes with its own built-in vector DB and search as part of every inference cycle, or something.

But I know that there is something about the larger models that is required for really intelligent responses. Or at least that's how it seems, because smaller models are just not as smart.

If we could figure out how to change it so that you would rarely need to update the background knowledge during inference and most of that could live on disk, that would make this dramatically more economical.

Maybe a model could have retrieval built in, and be trained to reduce the number of retrievals the longer the context is. Or something.
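
A minimal sketch of that retrieval-in-the-loop idea, assuming hypothetical embed() and generate_step() callables standing in for whatever embedding model and LLM you actually run; the point is only that the knowledge lookup happens per inference cycle instead of living in the weights:

    # Minimal sketch of "retrieval as part of every inference cycle".
    # embed() and generate_step() are hypothetical stand-ins for whatever
    # embedding model / LLM you actually run; the "knowledge" is plain numpy.
    import numpy as np

    class ExternalKnowledge:
        def __init__(self, passages, embed):
            self.passages = passages
            self.embed = embed
            # In a real system these vectors would live on disk (memory-mapped
            # or in a vector DB), not inside the model's parameters.
            self.vectors = np.stack([embed(p) for p in passages])

        def lookup(self, query, k=3):
            q = self.embed(query)
            sims = self.vectors @ q / (
                np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(q) + 1e-9
            )
            return [self.passages[i] for i in np.argsort(-sims)[:k]]

    def answer(prompt, kb, generate_step, max_steps=8):
        context = kb.lookup(prompt)                 # retrieve before the first step
        output = ""
        for _ in range(max_steps):
            step = generate_step(prompt, context, output)
            if step.get("needs_retrieval"):         # model asks for more background
                context += kb.lookup(step["query"])
                continue
            output += step["text"]
            if step.get("done"):
                break
        return output

The appeal is exactly what the comment describes: the bulk of the "knowledge" sits in cheap storage, and only a comparatively small reasoning core has to stay resident.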

joshuakogut · a year ago
Yesterday, when I started evaluating DeepSeek-R1 V3, it was insanely better at code generation using elaborate prompts. I asked it to write me some boilerplate code in Python using the ebaysdk library to pull a list of all products sold by the user $name, and it spit it out; just a few tweaks and it was ready to go.

I tried the same thing on the 7B and 32B models today; neither is as effective as codellama.
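
For reference, the requested boilerplate looks roughly like the hand-written sketch below, using ebaysdk's Finding API; YOUR_APP_ID and the seller name are placeholders, and this is not the model's actual output:

    # Rough sketch of the requested boilerplate using ebaysdk's Finding API.
    # YOUR_APP_ID and the seller name are placeholders; field access follows the
    # Finding API response schema, so double-check it against your SDK version.
    from ebaysdk.finding import Connection as Finding

    def items_sold_by(seller_name, app_id="YOUR_APP_ID"):
        api = Finding(appid=app_id, config_file=None)
        items, page = [], 1
        while True:
            response = api.execute("findItemsAdvanced", {
                "itemFilter": [{"name": "Seller", "value": seller_name}],
                "paginationInput": {"entriesPerPage": 100, "pageNumber": page},
            })
            result = response.reply.searchResult
            items.extend(getattr(result, "item", []) or [])
            if page >= int(response.reply.paginationOutput.totalPages):
                break
            page += 1
        return items

    if __name__ == "__main__":
        for item in items_sold_by("example_seller"):
            print(item.itemId, item.title)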


joshuakogut commented on Ask HN: Who wants to be hired? (December 2024)    · Posted by u/whoishiring
joshuakogut · a year ago
Location: AZ

Remote: yes

Willing to relocate: no

Technologies: Python/Django, SQL, Metabase, C#

resume/cv: https://drive.google.com/file/d/1px-oOulk-JZM9Ir6uh24w6pyNBL...

email: joshuamkogut@gmail.com

joshuakogut commented on Introducing Copilot+ PCs   blogs.microsoft.com/blog/... · Posted by u/skilled
delusional · 2 years ago
The AI is the least interesting part of this announcement. Microsoft is giving ARM another try with a special branding that's supposed to guarantee some level of performance and quality. That's possibly huge news.
joshuakogut · 2 years ago
'AI' is frequently the least interesting part of any announcement.
joshuakogut commented on Llama3 implemented from scratch   github.com/naklecha/llama... · Posted by u/Hadi7546
windowshopping · 2 years ago
Aaaaaaaaaa.org is possibly the worst domain name I've ever encountered in all my time using the internet. I support your mission but you need to change that.
joshuakogut · 2 years ago
While I agree with you, it's easy to remember using a simple rule: A*10.
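
For anyone checking, the rule reads literally as string repetition in Python (domain names are case-insensitive, so the leading capital doesn't matter):

    >>> "a" * 10
    'aaaaaaaaaa'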
joshuakogut commented on How bad are satellite megaconstellations for astronomy?   leonarddavid.com/blinded-... · Posted by u/belter
ooterness · 2 years ago
No. The moon is tidally locked to the Earth, so its rotation exactly matches its orbital period. If you put a telescope on the far side, it'll always be pointing away from Earth.
joshuakogut · 2 years ago
This is the principle behind the Lagrange point chosen for the James Webb Space Telescope: at Sun-Earth L2, the Sun and Earth always lie in roughly the same direction, so a single sunshield can keep their brightness off the instruments.
joshuakogut commented on How bad are satellite megaconstellations for astronomy?   leonarddavid.com/blinded-... · Posted by u/belter
jonplackett · 2 years ago
Is this a bit of a temporary problem, in that the very technology causing the issue - substantially cheaper access to space - will presumably ultimately put a frikkin massive telescope, or many massive telescopes, into orbit, where they'll also not have to deal with the many other issues ground telescopes have to deal with?

I get that this still sucks for any individual with a telescope though.

joshuakogut · 2 years ago
> put a frikkin massive telescope, or many massive telescopes into orbit

See: Hubble, JWST, and more to come

edit: JWST sits at Sun-Earth L2 rather than in LEO, but it still counts

joshuakogut commented on The Pile is a 825 GiB diverse, open-source language modelling data set (2020)   pile.eleuther.ai/... · Posted by u/bilsbie
Der_Einzige · 2 years ago
I came so close to getting my Debate document dataset "DebateSum" [1] included in this [2], and to this day I am very sad that it wasn't:

[1] https://github.com/Hellisotherpeople/DebateSum [2] https://github.com/EleutherAI/the-pile/issues/56

joshuakogut · 2 years ago
> If you’d like to contribute it, feel free to submit a PR

Stella was waiting for you to submit your dataset. Did you? She closed the ticket many months later.

joshuakogut commented on Firefly III: A free and open source personal finance manager   firefly-iii.org/... · Posted by u/thunderbong
plondon514 · 2 years ago
Thanks for the feedback; it's like a forecasting feature. Say you want to spend over budget today: you can see how that changes your daily budget 1 day, 2 days, 3 days, etc. out from today. Or let's say you plan on being frugal and will spend only $20.00 over the next 3 days; by the 4th day your daily budget will adjust accordingly: https://imgur.com/a/hD9R2zv
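
A back-of-the-envelope illustration of the adjustment being described, with hypothetical month-end numbers and not the app's actual code: the remaining budget is simply spread over the remaining days, so underspending for a few days raises the daily figure afterwards.

    # Toy illustration of the forecast described above, not the app's actual code:
    # the remaining budget for the period is spread over the remaining days.
    def daily_budget(remaining_budget, days_left):
        return remaining_budget / days_left if days_left else 0.0

    remaining, days_left = 300.00, 10                      # hypothetical numbers
    print(round(daily_budget(remaining, days_left), 2))    # 30.0 per day today

    # Plan to spend only $20.00 total over the next 3 days:
    remaining -= 20.00
    days_left -= 3
    print(round(daily_budget(remaining, days_left), 2))    # 40.0 per day from day 4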
joshuakogut · 2 years ago
What a great feature! I also had no idea what it meant, so for UI/UX reasons I'd add some alt-text explaining it.
joshuakogut commented on Kagi Changelog 2/13: Faster and more accurate instant answers and Wikipedia page   kagi.com/changelog#3179... · Posted by u/goplayoutside
mattbaker · 2 years ago
I’ve been using Kagi full time and I like it a lot. It’s been worth the price.

I expected to like lenses and favoring/blocking specific domains. What I didn’t expect was how much their “Quick Answer” would change how I search.

I’ve been “AI hesitant”; in general, the chance that an LLM will hallucinate makes these kinds of tools more trouble than they’re worth for me personally. In Kagi’s case, though, the individual facts it states in the quick answers have citations linking to the site it drew that information from.

Here’s what I’ve found:

- it’s been accurate most of the time, but not 100% (as expected)

- citations are pretty accurate most of the time

- every so often the citation links to a page that seemingly doesn’t back the claim in the quick answer

Unsurprisingly, I don’t trust the AI-generated quick answer in isolation; what it does do is let me scan a few paragraphs, find the one that answers my question most specifically, and visit the sites it links to as citations for that piece of the answer. This saves me the time of clicking through the top $N results and scanning each page to find the one that seems to answer my query most directly. It’s like a layer on top of the page rank.

I remember using Google for the first time and being impressed by how much more relevant the top answers were than Yahoo’s; it was a huge time saver. Now I find myself wondering if the “quick answer” citations will prove to be a similar jump in accelerating my ability to find the right web page.

It also makes me wonder if their own page rank algorithm could incorporate the quick answer output as an input to a site’s rank. That would be an interesting experiment!

joshuakogut · 2 years ago
What happens when it crawls some LLM-generated text on a website and uses that as a citation?
