Readit News
refulgentis commented on Bring Back the Blue-Book Exam   chronicle.com/article/bri... · Posted by u/diodorus
everybodyknows · 20 hours ago
Why would anyone downvote this thoughtful question?
refulgentis · 7 hours ago
Been here 15 years and there's just more voting based on gut vibe the last year or two. I don't really comment that much anymore after a crazier incident a month ago.

Feels like when /r/nba got too "joke-y" for me to participate. In a generic social environment, asking a question without the performative vulnerability that sounds fake to most people will come across as "aggressive" or as "putting someone down." (i.e. an environment where loudness / randomness / 'being polite', meaning not asking questions or inducing cognitive load, rule)

Thanks for noticing and saying something. Ngl, it made me feel bad, like I did something wrong.

refulgentis commented on Bring Back the Blue-Book Exam   chronicle.com/article/bri... · Posted by u/diodorus
djoldman · a day ago
It's relatively straightforward to immediately suspend test takers for a semester and expel them on a second infraction.

Administrations won't allow it because they just don't care enough. It's a pain dealing with complaining parents and students.

In any case, cheating has existed since forever. Cheating on in-class exams with AI now isn't much different from cheating without it before.

refulgentis · a day ago
I'm amenable to this argument (I'm the GP), but I do think this is unique.

AI is transformative here, in toto, in its total effect on cheating, because it's the first time you can feasibly "transfer" the question in with a couple of muscle-memory taps. I'm no expert, but I assume there's a substantive difference between IDing someone doing 200 thumb taps for a ~40-word question versus 2.

(The part I was missing personally was that they can easily have a second phone. Same principle as my little bathroom-break cheat sheet in 2005: you can't find what's undeclared, and they're going to be averse to patting kids down.)

refulgentis commented on Bring Back the Blue-Book Exam   chronicle.com/article/bri... · Posted by u/diodorus
mlpoknbji · a day ago
The commenters lamenting this trend presumably have not given a take-home assignment to college students in recent years. The problem is huge, and in-class tests are basically the only way to test whether students are learning. Unfortunately this doesn't solve the problem of AI-assisted cheating on in-class exams, which is shockingly prevalent these days, at least in STEM settings.
refulgentis · a day ago
I'm curious, how is AI assistance on an in-class exam even possible? I can't picture what AI changed relative to, say, the post-iPhone status quo. I.e. I expect there to be holes in security re: bathroom breaks, but even in 2006 they confiscated cell phones during exams.

I guess what I'm asking is, how did AI shift the status quo for in-class exams over, say, Safari?

refulgentis commented on Building A16Z's Personal AI Workstation   a16z.com/building-a16zs-p... · Posted by u/ProofHouse
CamperBob2 · 2 days ago
Compared to 4x RTX 6000 Blackwell boards, it's GPU poor. There has to be a reason they want to load up a tower chassis with $35K worth of GPUs, right? I'd have to assume it has strong advantages for inference as well as training, given that the GPU has more influence on TTFT with longer contexts than the CPU does.
refulgentis · 2 days ago
Right. I'd suggest that the idea that 128 GB of GPU RAM gives you an 8K context shows it may be worth revising priors such as "it has strong advantages for inference as well as training."

As Mr. Hildebrand used to say, when you assume, you make...

(Also note the article specifically frames this speccing-out as being about training :) it's not just me suggesting it.)

refulgentis commented on Building A16Z's Personal AI Workstation   a16z.com/building-a16zs-p... · Posted by u/ProofHouse
CamperBob2 · 2 days ago
What's the TTFT like on a GPU-poor rig, though, once you actually take advantage of large contexts?
refulgentis · 2 days ago
I guess I'd say: why is the Framework perceived as GPU-poor? I don't have one, but I also don't know why TTFT would be significantly worse than on an M-series (it's a good GPU!)
refulgentis commented on Building A16Z's Personal AI Workstation   a16z.com/building-a16zs-p... · Posted by u/ProofHouse
CamperBob2 · 2 days ago
> If you're inferencing, you're not getting much more out of this than you would a ~$2K Framework desktop.

Well, you're getting the ability to maintain a context bigger than 8K or so, for one thing.

refulgentis · 2 days ago
Well, no; we're off by a factor of about 64x at the very least: a 64 GB M2 Max/M4 Max tops out at about 512K context for 20B params, and the Framework Desktop I'm referencing has 128 GB of unified memory.
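
Back-of-envelope on where a number like 512K comes from. This is a minimal sketch; every architecture and quantization figure below is an illustrative assumption, not any specific 20B model:

    # Rough KV-cache budget sketch. Assumed, not measured: GQA with 48 layers,
    # 8 KV heads, head_dim 128, 8-bit KV cache, ~4-5 bits/param weights.
    GiB = 1024 ** 3

    total_mem = 64 * GiB      # unified memory on the box
    weights   = 12 * GiB      # ~20B params at ~4-5 bits/param
    overhead  = 2 * GiB       # activations, OS, etc.
    kv_budget = total_mem - weights - overhead

    n_layers, n_kv_heads, head_dim = 48, 8, 128
    kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * 1  # K+V, 1 byte each

    max_context = kv_budget // kv_bytes_per_token
    print(f"{kv_bytes_per_token / 1024:.0f} KiB/token -> ~{max_context // 1024}K tokens")

On those assumptions you land around 500K+ tokens of context in 64 GB, and roughly double that headroom with 128 GB of unified memory; either way, orders of magnitude past 8K.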
refulgentis commented on Building A16Z's Personal AI Workstation   a16z.com/building-a16zs-p... · Posted by u/ProofHouse
NitpickLawyer · 2 days ago
> If you're inferencing, you're not getting much more out of this than you would a ~$2K Framework desktop.

I was with you up till here. Come on! CPU inferencing is not it; even Macs struggle with bigger models and longer contexts (esp. visible when agentic stuff gets > 32k tokens).

The PRO6000 is the first GPU from their "workstation" series that actually makes sense to own.

refulgentis · 2 days ago
Er, CPU inferencing? :) I didn't think I mentioned that!

The Framework Desktop thing is that it has unified memory shared with the GPU, so, much like an M-series, you can inference disproportionately large models.

refulgentis commented on Building A16Z's Personal AI Workstation   a16z.com/building-a16zs-p... · Posted by u/ProofHouse
chis · 2 days ago
A16Z is consistently the most embarrassing VC firm at any given point in time. I guess optimistically they might be doing “outrage marketing” but it feels more like one of those places where the CEO is just an idiot and tells his employees to jump on every trend.

The funny part is that they still make money. It seems like once you’ve got the connections, being a VC is a very easy job these days.

refulgentis · 2 days ago
It's been such a mind-boggling decline in intellect, combined with really odd and intense conspiratorial behavior around crypto, that I dug into it a bit a few months ago.

My weak, uncited understanding from then is that they're poorly positioned: in our set they're still the guys who write you a big check for software, but in the VC set they're a joke, i.e. they misunderstood carpet-bombing investment as something that scales and went all in on way too many crypto firms. Now they've embarrassed themselves with a ton of assets that need to get marked down; they're clearly behind the other bigs, but there's no forcing function to do the markdowns.

So we get primal screams about politics and LLM-generated articles about how a $9K video card is the perfect blend of price and performance.

There are other comments effusively praising them for their unique technical expertise. I maintain a llama.cpp client on every platform you can think of, and nothing in this article makes any sense. If you're training, you wouldn't do it on only 4 $9K GPUs that you own. If you're inferencing, you're not getting much more out of this than you would a ~$2K Framework desktop.

refulgentis commented on FFmpeg 8.0   ffmpeg.org/index.html#pr8... · Posted by u/gyan
droopyEyelids · 3 days ago
Could you explain more about it? I assumed the maintainers were doing it as part of their jobs for a company (a completely baseless assumption).
refulgentis · 3 days ago
Re-upvoted you from gray because I don't think that's fair, but I also don't know how much there is to add. As far as why I'm contributing: I haven't been socially involved in the ffmpeg dev community in a decade, but it's a very reasonable floor to assume it's 80% not full-time paid contributors.
refulgentis commented on DeepSeek-v3.1   api-docs.deepseek.com/new... · Posted by u/wertyk
dragonwriter · 4 days ago
> In the modes in APIs, the sampling code essentially "rejects and reinference" any token sampled that wouldn't create valid JSON under a grammar created from the schema.

I thought the APIs in use generally interface with backend systems supporting logit manipulation, so there is no need to reject and reinference anything; it's guaranteed right the first time because any token that would be invalid has a 0% chance of being produced.

I guess for the closed commercial systems that's speculative, but all the discussion of the internals of the open source systems I've seen has indicated that, and I don't know why the closed systems would be less sophisticated.

refulgentis · 4 days ago
I maintain a cross-platform llama.cpp client - you're right to point out that, generally, we expect nuking logits can take care of it.

There is a substantial performance cost to nuking; the open source internals discussion may have glossed over that for clarity (see the github.com/llama.cpp/... link below). The cost is very high, so the default in the API* is to not artificially lower other logits, and to only do that if the first inference attempt yields a token that's invalid in the compiled grammar.

Similarly, I was hoping to be on target w/r/t what strict mode is in an API, and am sort of describing the "outer loop" of sampling.

* Blissfully, you do not have to implement it manually anymore; it is a parameter in the sampling params member of the inference params.

* "the grammar constraints applied on the full vocabulary can be very taxing. To improve performance, the grammar can be applied only to the sampled token..and nd only if the token doesn't fit the grammar, the grammar constraints are applied to the full vocabulary and the token is resampled." https://github.com/ggml-org/llama.cpp/blob/54a241f505d515d62...

u/refulgentis

Karma: 4157 · Cake day: June 8, 2010
About
jpohhhh.com

jpohhhh@gmail.com
