Readit News
refulgentis commented on Bring Back the Blue-Book Exam   chronicle.com/article/bri... · Posted by u/diodorus
everybodyknows · 20 hours ago
Why would anyone downvote this thoughtful question?
refulgentis · 7 hours ago
Been here 15 years and there's just more voting based on gut vibe the last year or two. I don't really comment that much anymore after a crazier incident a month ago.

Feels like when /r/nba got too "joke-y" for me to participate. In a generic social environment, asking a question without the performative vulnerability that sounds fake to most people will come across as "aggressive" or as "putting someone down." (i.e. an environment where loudness / randomness / 'being polite', meaning not asking questions or inducing cognitive load, rule)

Thanks for noticing and saying something. Ngl, it made me feel bad, like I did something wrong.

refulgentis commented on Bring Back the Blue-Book Exam   chronicle.com/article/bri... · Posted by u/diodorus
djoldman · a day ago
It's relatively straightforward to immediately suspend test takers for a semester and expel them on a second infraction.

Administrations won't allow it because they just don't care enough. It's a pain dealing with complaining parents and students.

In any case, cheating has existed since forever. Cheating on in-class exams with AI now isn't much different from cheating without it before.

refulgentis · a day ago
I'm amenable to this argument (I'm the GP), but I do think this is unique.

AI is transformative here, in toto, in its total effect on cheating, because it's the first time you can feasibly "transfer" the question in with a couple of muscle-memory taps. I'm no expert, but I assume there's a substantive difference between IDing someone doing 200 thumb taps for a ~40-word question versus 2.

(The part I was missing personally was that they can easily have a second phone. Same principle as my little bathroom-break cheat sheet in 2005: you can't find what's undeclared, and they're going to be averse to patting kids down.)

refulgentis commented on Bring Back the Blue-Book Exam   chronicle.com/article/bri... · Posted by u/diodorus
mlpoknbji · a day ago
The commenters lamenting this trend presumably have not given a take-home assignment to college students in recent years. The problem is huge, and in-class tests are basically the only way to test whether students are learning. Unfortunately this doesn't solve the problem of AI-assisted cheating on in-class exams, which is shockingly prevalent these days, at least in STEM settings.
refulgentis · a day ago
I'm curious, how is AI assistance on an in-class exam even possible? I can't picture what AI changed relative to, say, the post-iPhone status quo. I.e. I expect there to be holes in security re: bathroom breaks, but even in 2006 they confiscated cell phones during exams.

I guess what I'm asking is, how did AI shift the status quo for in-class exams over, say, Safari?

refulgentis commented on Building A16Z's Personal AI Workstation   a16z.com/building-a16zs-p... · Posted by u/ProofHouse
CamperBob2 · 2 days ago
Compared to 4x RTX 6000 Blackwell boards, it's GPU poor. There has to be a reason they want to load up a tower chassis with $35K worth of GPUs, right? I'd have to assume it has strong advantages for inference as well as training, given that the GPU has more influence on TTFT with longer contexts than the CPU does.
refulgentis · 2 days ago
Right. I'd suggest that the idea that 128 GB of GPU RAM gives you an 8K context shows it may be worth revising priors such as "it has strong advantages for inference as well as training."

As Mr. Hildebrand used to say, when you assume, you make...

(Also note the article specifically frames this speccing-out as being about training :) it's not just me suggesting it.)

refulgentis commented on Building A16Z's Personal AI Workstation   a16z.com/building-a16zs-p... · Posted by u/ProofHouse
CamperBob2 · 2 days ago
What's the TTFT like on a GPU-poor rig, though, once you actually take advantage of large contexts?
refulgentis · 2 days ago
I guess I'd say: why is the Framework perceived as GPU-poor? I don't have one, but I also don't know why TTFT would be significantly worse than on an M-series (it's a good GPU!)
refulgentis commented on Building A16Z's Personal AI Workstation   a16z.com/building-a16zs-p... · Posted by u/ProofHouse
CamperBob2 · 2 days ago
> If you're inferencing, you're not getting much more out of this than you would a ~$2K Framework desktop.

Well, you're getting the ability to maintain a context bigger than 8K or so, for one thing.

refulgentis · 2 days ago
Well, no; we're off by a factor of about 64x at the very least: a 64 GB M2 Max/M4 Max tops out at about 512K context for 20B params, and the Framework Desktop I'm referencing has 128 GB of unified memory.
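
Back-of-envelope on where a number like 512K comes from. This is a minimal sketch; every architecture and quantization figure below is an illustrative assumption, not any specific 20B model:

    # Rough KV-cache budget sketch. Assumed, not measured: GQA with 48 layers,
    # 8 KV heads, head_dim 128, 8-bit KV cache, ~4-5 bits/param weights.
    GiB = 1024 ** 3

    total_mem = 64 * GiB      # unified memory on the box
    weights   = 12 * GiB      # ~20B params at ~4-5 bits/param
    overhead  = 2 * GiB       # activations, OS, etc.
    kv_budget = total_mem - weights - overhead

    n_layers, n_kv_heads, head_dim = 48, 8, 128
    kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * 1  # K+V, 1 byte each

    max_context = kv_budget // kv_bytes_per_token
    print(f"{kv_bytes_per_token / 1024:.0f} KiB/token -> ~{max_context // 1024}K tokens")

On those assumptions you land around 500K+ tokens of context in 64 GB, and roughly double that headroom with 128 GB of unified memory; either way, orders of magnitude past 8K.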
refulgentis commented on Building A16Z's Personal AI Workstation   a16z.com/building-a16zs-p... · Posted by u/ProofHouse
NitpickLawyer · 2 days ago
> If you're inferencing, you're not getting much more out of this than you would a ~$2K Framework desktop.

I was with you up till here. Come on! CPU inferencing is not it; even Macs struggle with bigger models and longer contexts (esp. visible when agentic stuff gets > 32k tokens).

The PRO6000 is the first GPU from their "workstation" series that actually makes sense to own.

refulgentis · 2 days ago
Er, CPU inferencing? :) I didn't think I mentioned that!

The Framework Desktop thing is that it has unified memory shared with the GPU, so, much like an M-series, you can inference disproportionately large models.

refulgentis commented on Building A16Z's Personal AI Workstation   a16z.com/building-a16zs-p... · Posted by u/ProofHouse
chis · 2 days ago
A16Z is consistently the most embarrassing VC firm at any given point in time. I guess optimistically they might be doing “outrage marketing” but it feels more like one of those places where the CEO is just an idiot and tells his employees to jump on every trend.

The funny part is that they still make money. It seems like once you’ve got the connections, being a VC is a very easy job these days.

refulgentis · 2 days ago
It's been such a mind-boggling decline in intellect, combined with really odd and intense conspiratorial behavior around crypto, that I dug into it a bit a few months ago.

My weak, uncited understanding from then is that they're poorly positioned: in our set they're still the guys who write you a big check for software, but in the VC set they're a joke, i.e. they misunderstood carpet-bombing investment as something that scales and went all in on way too many crypto firms. Now they've embarrassed themselves with a ton of assets that need to get marked down; they're clearly behind the other bigs, but there's no forcing function to do the markdowns.

So we get primal screams about politics and LLM-generated articles about how a $9K video card is the perfect blend of price and performance.

There are other comments effusively praising them for their unique technical expertise. I maintain a llama.cpp client on every platform you can think of, and nothing in this article makes any sense. If you're training, you wouldn't do it on only 4 $9K GPUs that you own. If you're inferencing, you're not getting much more out of this than you would a ~$2K Framework desktop.

refulgentis commented on FFmpeg 8.0   ffmpeg.org/index.html#pr8... · Posted by u/gyan
droopyEyelids · 3 days ago
Could you explain more about it? I assumed the maintainers were doing it as part of their jobs for a company (a completely baseless assumption).
refulgentis · 3 days ago
Re-upvoted you from gray because I don't think that's fair, but I also don't know how much there is to add. As far as why I'm contributing: I haven't been socially involved in the ffmpeg dev community in a decade, but it's a very reasonable floor to assume it's 80% not full-time paid contributors.
refulgentis commented on DeepSeek-v3.1   api-docs.deepseek.com/new... · Posted by u/wertyk
dragonwriter · 4 days ago
> In the modes in APIs, the sampling code essentially "rejects and reinference" any token sampled that wouldn't create valid JSON under a grammar created from the schema.

I thought the APIs in use generally interface with backend systems supporting logit manipulation, so there is no need to reject and reinference anything; it's guaranteed right the first time because any token that would be invalid has a 0% chance of being produced.

I guess for the closed commercial systems that's speculative, but all the discussion of the internals of the open source systems I've seen has indicated that, and I don't know why the closed systems would be less sophisticated.

refulgentis · 4 days ago
I maintain a cross-platform llama.cpp client - you're right to point out that, generally, we expect nuking logits can take care of it.

There is a substantial performance cost to nuking; the open source internals discussion may have glossed over that for clarity (see the github.com/llama.cpp/... link below). The cost is very high, so the default in the API* is to not artificially lower other logits, and to only do that if the first inference attempt yields a token that's invalid in the compiled grammar.

Similarly, I was hoping to be on target w/r/t what strict mode is in an API, and am sort of describing the "outer loop" of sampling.

* Blissfully, you do not have to implement it manually anymore; it is a parameter in the sampling params member of the inference params.

* "the grammar constraints applied on the full vocabulary can be very taxing. To improve performance, the grammar can be applied only to the sampled token..and nd only if the token doesn't fit the grammar, the grammar constraints are applied to the full vocabulary and the token is resampled." https://github.com/ggml-org/llama.cpp/blob/54a241f505d515d62...

u/refulgentis

Karma: 4157 · Cake day: June 8, 2010
About
jpohhhh.com

jpohhhh@gmail.com
