The piece makes a basic measurement mistake: it assumes that all variability is meaningful variability.
There are ways of making the argument they're trying to make, but they're not doing that.
Also, sometimes a single overall score is useful. A better analogy than the cockpit analogy they use is clothing sizing. Yes, tailored shirts, based on detailed measurements of all your body parts, fit awesome, but for many people, small, medium, large, x-large, and so forth suffice.
I think there's a lesson here about reinventing the wheel.
I appreciate the goals of the company and wish them the best, but they need a psychometrician or assessment psychologist on board.
We aren't trying to make a rigorous statement here -- we're trying to draw attention to the fact that the most common metrics do not give much insight into what a student has actually shown mastery of. This is especially important when you consider that the weightings of particular questions are often fairly arbitrary.
I certainly agree that all variability is not meaningful variability, but I'd push back a bit and say that there's meaningful variability in what's shown here. We'll go into more depth and hopefully have something interesting to report.
I've also seen a fair number of comments stating that this is not a surprising result. I'd agree (if you've thought about it), but if you look at what's happening in practice, it's clear that many people either would be surprised by this or are at least unable to act on it. We're hoping to help with the latter.
This kind of data is commonly modeled using item response theory (IRT). I suspect that even in data generated by a unidimensional IRT model (which they are arguing against), you might get the results they report, depending on the level of measurement error in the model.
Measurement error is the key here, but it is not considered in the article. That, plus setting an unjustified margin of 20% around the average, is very strange. An analogous situation would be criticizing a simple regression by looking at how many points fall X units above/below the fitted line, without explaining your choice of X.
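To make the point concrete, here's a quick simulation sketch of my own (not from the article), assuming a standard unidimensional 2PL model with arbitrarily chosen parameter ranges, and applying the ±20% band to total scores purely for illustration. Even though the data come from a single latent trait, two students with the same total almost never share the same response pattern, and the share of students flagged by the band is driven by the measurement error and the band itself, not by any multidimensional structure:

    # Minimal sketch: simulate a unidimensional 2PL IRT model and check
    # (a) how often students with the same total score answered different items,
    # (b) how many students fall outside an arbitrary +/-20% band around the mean.
    import numpy as np

    rng = np.random.default_rng(0)
    n_students, n_items = 1000, 20

    theta = rng.normal(0, 1, n_students)        # single latent ability
    a = rng.uniform(0.5, 2.0, n_items)          # discrimination (assumed range)
    b = rng.normal(0, 1, n_items)               # difficulty (assumed range)

    # P(correct) under the 2PL model, then Bernoulli draws = measurement error
    p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
    responses = rng.random((n_students, n_items)) < p
    scores = responses.sum(axis=1)

    # (a) students with identical totals rarely share identical response patterns
    diff_pattern, pairs = 0, 0
    for s in np.unique(scores):
        idx = np.flatnonzero(scores == s)
        for i in range(len(idx)):
            for j in range(i + 1, len(idx)):
                pairs += 1
                if not np.array_equal(responses[idx[i]], responses[idx[j]]):
                    diff_pattern += 1
    print(f"same total, different pattern: {diff_pattern}/{pairs} pairs")

    # (b) the fraction flagged by a +/-20% band depends on the band and the noise,
    # even though these data are strictly unidimensional
    pct = scores / n_items
    print("outside +/-20% of mean:", np.mean(np.abs(pct - pct.mean()) > 0.20))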
The main point of this post is to highlight that the most common metric of student performance may not be that useful. Most of the time, students will get their score, the average score, and sometimes a standard deviation as well. As jimhefferon mentioned in a response to a different comment, the conventional wisdom is that two students with the same grade know roughly the same stuff, and that doesn't seem to be true.
We're hoping to build some tools here to help instructors give students a better experience by helping them cater to the different groups that are present.
disclaimer: I'm one of the founders of Gradescope.
https://en.m.wikipedia.org/wiki/University_of_California_fin...
If I recall correctly, computers have been grading GMAT essays for at least 10 years, but they had to keep a human in the loop, because a human grader is the ultimate measure of whether a computer is "correct" in grading an essay.
The automated essay grading stuff typically looks at writing style more than content, but it's true that it's a problem that tons of people have worked on and there's been some cool progress there as well. We're not really working on bringing AI to essay grading ourselves though.
But code review is more than just reviewing diffs. I need to test the code by actually building and running it. How does that critical step fit into this workflow? If the async runner stops after it finishes writing code, do I then need to download the PR to my machine, install dependencies, etc. to test it? That's a major flow blocker for me and defeats the entire purpose of such a tool.
I was planning to build always-on devcontainers on a bare-metal server. So after Claude Code does its thing, I have a live, running version of my app to test alongside the diffs. Sort of like Netlify/Vercel branch deploys, but with a full-stack container.
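Roughly the shape of what I mean, as a sketch only (placeholder names, ports, and paths; it assumes Docker and a Dockerfile in the repo, and uses plain docker rather than the devcontainer CLI to keep it short):

    # Sketch: build and run one preview container per branch, so each
    # Claude Code PR gets a live environment next to its diff.
    import subprocess

    def deploy_branch_preview(repo_path: str, branch: str, base_port: int = 8000) -> str:
        tag = f"preview-{branch.replace('/', '-')}"
        port = base_port + (hash(branch) % 1000)  # naive, non-stable port pick for the sketch

        # check out the branch and build an image from the repo's Dockerfile
        subprocess.run(["git", "-C", repo_path, "checkout", branch], check=True)
        subprocess.run(["docker", "build", "-t", tag, repo_path], check=True)

        # replace any previous preview container for this branch, then start a new one
        subprocess.run(["docker", "rm", "-f", tag], check=False)
        subprocess.run(
            ["docker", "run", "-d", "--name", tag, "-p", f"{port}:8000", tag],
            check=True,
        )
        return f"http://localhost:{port}"  # would be a real hostname in practice

    # e.g. deploy_branch_preview("/srv/myapp", "claude/fix-login")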
Claude Code also works far better in an agentic loop when it can self-heal by running tests, executing one-off terminal commands, tailing logs, and querying the database. I need to do this anyway. For me, a mobile async coding workflow needs to have a container running with a mobile-friendly SSH terminal, database viewer, logs viewer, lightweight editor with live preview, and a test runner. Diffs just don't cut it for me.
I do believe that before 2025 is over we will achieve the dream of doing real software engineering on mobile. I was planning to build it myself anyway.