agoose77 (u/agoose77)

agoose77 commented on "AI discourse" is a joke purplesyringa.moe/blog/ai... · Posted by u/bertman

emsign · 2 months ago

What annoys me about AI discourse are two things that in the end never seem to be considered:

1. Who's gonna pay back the investors their trillions of dollars and with what?

2. Didn't we have to start thinking about reducing energy consumption like at least a decade ago?

agoose77 · 2 months ago

I love this angle, and would take it further. I'm starting to think about AI in the same way that we think about food ethics.

Some people are vegan, some people eat meat. Usually, these two parties get on best when they can at least understand each other's perspectives and demonstrate an understanding of the kinds of concerns the other might have.

When talking to people about AI, I feel much more comfortable when people acknowledge the concerns, even if they're still using AI in their day-to-day.

agoose77 commented on Nanonets-OCR-s – OCR model that transforms documents into structured markdown huggingface.co/nanonets/N... · Posted by u/PixelPanda

jtbayly · 2 months ago

So I’m left to manually link them up?

Have you considered using something like Pandoc’s method of marking them up? Footnotes are a fairly common part of scanned pages, and markdown that doesn’t indicate that a footnote is a footnote can be fairly incomprehensible.

agoose77 · 2 months ago

I am lazily posting this all over the thread, but do check out MyST Markdown too! https://mystmd.org. We handle footnotes as a structured object.
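To illustrate what "linking them up" could look like, here is a minimal sketch in Python. It assumes the Pandoc/MyST-style syntax where a reference is written `[^label]` and its definition is `[^label]: text` at the start of a line. The `link_footnotes` helper and its return shape are hypothetical, written only for this example; this is not how MyST or Pandoc actually implement footnotes.

```python
import re

# Hypothetical helper for illustration; not part of MyST or Pandoc.
# A definition is "[^label]: text" at the start of a line.
DEFINITION = re.compile(r"^\[\^([^\]\s]+)\]:[ \t]*(.*)$", re.MULTILINE)
# A reference is "[^label]" that is NOT immediately followed by a colon.
REFERENCE = re.compile(r"\[\^([^\]\s]+)\](?!:)")


def link_footnotes(markdown: str) -> dict[str, dict]:
    """Map each footnote label to its definition text and reference count."""
    notes = {
        match.group(1): {"text": match.group(2).strip(), "references": 0}
        for match in DEFINITION.finditer(markdown)
    }
    for match in REFERENCE.finditer(markdown):
        label = match.group(1)
        # References without a matching definition are ignored in this sketch.
        if label in notes:
            notes[label]["references"] += 1
    return notes


sample = "Body text.[^1] More.[^note]\n\n[^1]: First note.\n[^note]: Second note.\n"
linked = link_footnotes(sample)
```

A real parser would also need to handle multi-line definitions and footnote markers inside code blocks, which this sketch deliberately skips.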

agoose77 commented on Nanonets-OCR-s – OCR model that transforms documents into structured markdown huggingface.co/nanonets/N... · Posted by u/PixelPanda

viraptor · 2 months ago

Do you know why myst got traction, instead of RST which seems to have all the custom tagging and extensibility built in from the beginning?

agoose77 · 2 months ago

MyST Markdown (the MD flavour, not the same-named Document Engine) was inspired by ReST. It was created to address the main pain-point of ReST for incoming users (it's not Markdown!).

As a project, the tooling to parse MyST Markdown was built on top of Sphinx, which primarily expects ReST as input. Now, I would not be surprised if most _new_ Sphinx users are using MyST Markdown (but I have no data there!)

Subsequently, the Jupyter Book project that built those tools has pivoted to building a new document engine that's better focused on the use-cases of our audience and leaning into modern tooling.

agoose77 commented on Nanonets-OCR-s – OCR model that transforms documents into structured markdown huggingface.co/nanonets/N... · Posted by u/PixelPanda

starkparker · 2 months ago

I was more excited to hear about "structured Markdown" than the LLM OCR model, but the extent of it just seems to be tagging certain elements. It's useful in the LLM context but not as much outside of it.

agoose77 · 2 months ago

Feel free to check out MyST Markdown, which very much aims to specify "structured Markdown": https://mystmd.org

agoose77 commented on Nanonets-OCR-s – OCR model that transforms documents into structured markdown huggingface.co/nanonets/N... · Posted by u/PixelPanda

mgr86 · 2 months ago

Understandable. I work in academic publishing, and while the XML is everywhere crowd is graying, retiring, or even dying :( it still remains an excellent option for document markup. Additionally, a lot of government data produced in the US and EU make heavy use of XML technologies. I imagine they could be an interested consumer of Nanonets-OCR. TEI could be a good choice as well tested and developed conversions exist to other popular, less structured, formats.

agoose77 · 2 months ago

Do check out MyST Markdown (https://mystmd.org)! Academic publishing is a space where MyST is being used, such as https://www.elementalmicroscopy.com/ via Curvenote.

(I'm a MyST contributor)

agoose77 commented on Show HN: Wetlands – a lightweight Python library for managing Conda environments arthursw.github.io/wetlan... · Posted by u/arthursw

reedf1 · 3 months ago

Use PDM with the UV backend - this accomplishes this in a much more lightweight and performant way.

agoose77 · 3 months ago

The PyPI ecosystem cannot, for the foreseeable future, replicate the scope of the conda ecosystem. From microarch builds to library deduplication, conda is a more general-purpose solution. That doesn't mean that one "wins out" (and, for reference, I predominantly use Python's PyPI), but they're not the same tools.

agoose77 commented on Please Fund More Science (2020) blog.samaltman.com/please... · Posted by u/ssuds

n2d4 · 3 months ago

That's not true. The pandemic ended as Omicron became the dominant strain, which was by some measures 90% less fatal than Delta.

It's selective breeding; because we became careful about recognizing symptoms, any severe strain would cause the infected to isolate and hence not infect others. Therefore, Omicron was often symptomless, and COVID-19 was no longer deemed as much of a threat.

agoose77 · 3 months ago

I don't disagree with the general vibe here, but a few points:

- It's hard to compare Omicron vs delta because of the number of confounding variables - population heterogeneity, vaccine + infection induced immunity, etc.
- Severe strains with latency periods are invulnerable to symptom recognition. I don't think the asymptomatic period for the COVID variants varied as much in the lower bound as it did the upper bound. The point being -- behavioural changes are much more likely to be general caution (i.e. limiting contacts, spacing social events in time, etc.) than responsive (I feel unwell).

agoose77 commented on Modern LaTeX github.com/mrkline/modern... · Posted by u/signa11

agoose77 · 4 months ago

A shameless plug for the MyST Engine https://mystmd.org/

It's a document engine that ingests Markdown (particularly the MyST superset) and builds upon "structured data" for sharing.

E.g. SciPy's proceedings: https://proceedings.scipy.org/articles/XHDR4700

agoose77 commented on A protein folding mystery solved: Study explains core packing fractions phys.org/news/2025-03-pro... · Posted by u/PaulHoule

try_the_bass · 4 months ago

> This is the kind of cool stuff I'm going to miss during the coming dark ages.

I really don't get this level of hyperbole. There's so much hand-wringing about funding getting cut, but it turns out it's like a 15% reduction[0]. That's not an insignificant amount, but it's not the end of the world. Taken naively, that's 15% less research that gets done. One can hope that, being a pillar of academia, the intelligent folks over at Yale can figure out how to spend 15% less on research, so the same amount of research gets done with fewer dollars. Or, better yet, they can put more effort into finding and cutting the rising levels of fraud amongst academic researchers[1].

I think 15% might be too drastic, but at the end of the day, things can't always progress up and to the right, all day every day. If you don't want waste, you sometimes have to cut things, or at the very least apply pressure to them. This mindset of "any cut is bad!" prevents necessary cuts, especially when coupled with this "everyone gets a voice" mindset, simply because you can always find someone to speak up in protection of anything--even fraud! I'd say you'd be surprised by how vigorously people protest their own innocence when they're clearly participating in bad behavior, but like... _gestures at everything_

Don't get me wrong, I think this administration is going about this in mostly the wrong ways, but the problem is, they're doing something those in the affected academic organizations refused to do, namely: applying sufficient adversity to the system to keep it strong.[2] The fact that fraud among scientific research is increasing over time is ample evidence that they're not doing enough to self-police. I don't know how rigorously studied the phenomenon is, but I've certainly seen an increase in popular science coverage of various frauds and scandals in all kinds of scientific fields over the years. Should we really continue paying and promoting the people who are perpetrating this fraud? (As an aside, I wonder how much money is given back to the government when fraud like this is exposed before the grant is fully filled? Or does it usually escape detection until after the grant has been paid out? Anyone know this?)

When you depend on someone else funding your studies, but don't do sufficient legwork to keep things operating smoothly, why is it seemingly the end of the world for the organization providing the funding to decide to cut it? This is essentially the ruling demographic saying: "we think you're wasting our money, so we're going to give you less of it until we see you do better". Personally, I think this is a reasonable ask! I think the definition of "do better" is troubling in some cases, but this sort of thing should be happening all the time. I don't understand why you and seemingly so many others seem to think that the government shouldn't ever be cutting funding to research programs, especially when the level of waste just keeps going up. You and others constantly hyperbolize an (admittedly large) cut into "oh no it's the end of the world". But it really isn't, and it's not even really an insurmountable challenge. Run a few plagiarism/LLM checks, fire/expel the worst offenders, and you've already saved a significant fraction of the newfound deficit! Yeah, you might destroy some "promising" careers, but look: attempting to deceive the entire world for personal gain (even if just to maintain a basic standard of living!) probably should come with a pretty stiff penalty. The kind of person who would falsify data for personal gain is only promising to do more of the same for their whole career. They're exactly the kind of people that academia should be vigorously expelling.

To look at it from another angle: Academic research needs to be built on a foundation of trust. There will also always be adversaries in the system, and how hard they have to work to stay hidden is dependent on how much oversight there is. If the oversight is lax, adversaries can thrive, which ultimately erodes trust both within the system and without. If academia (as a nebulous whole) is not doing enough internal oversight to keep adversaries in check, then it falls to those outside academia to try to effect this oversight. Given the current capitalistic nature of our society, this tends to come in the form of withholding or cutting funding. The more the trust erodes, the stronger the external response, which is what I think we're seeing today. But while a 15% cut might be "too far" or "too much" or "too inaccurate in allocation", consider that part of the reason these cuts are happening is that those "outside the system" have lost trust in the academic system in this country. In response, they did what they could: elected adversaries of the system as it exists today.

And why have the people who support these cuts lost trust in the academic system? Abstractly, I think this boils down to the contrast between this apparent lack of internal oversight and the nature of academia itself: the pursuit of knowledge. Academia literally exists to discover new truths and present them to the rest of the world. It asks the rest of the world to subsidize this learning in various ways, with the promise that the newfound knowledge will vastly repay the subsidy. But when the knowledge the academic system is putting out is increasingly found to actually be bullshit, it repeatedly breaks this promise.

---

Anyway, that's a lot of words to say I think your opinion is wildly hyperbolic and immature. I'm getting tired of folks defending an obviously imperfect system as if every small attack on it is "the end of democracy!" It's not helpful, and it just reinforces the image that folks who hold the same beliefs as yourself are also likely to be equally hyperbolic and immature. It's not a good look.

[0] https://yaledailynews.com/blog/2025/02/10/nih-slashes-indire...

[1] https://retractionwatch.com/2024/09/24/1-in-7-scientific-pap... unsure of the quality of this source, but fraud in research is definitely a thing I've been hearing more and more about, especially with generative AI getting let loose on it by folks with... looser morals

[2] https://www.apa.org/topics/resilience

agoose77 · 4 months ago

I want to be careful about what I write given the context of what's going on, and the personal ramifications that can have.

Suffice it to say, it's worth considering whether the cost of a decision can be interpreted solely as how much money there is vs the wider ecosystem-level consequences of said decision.

agoose77 commented on Germany's Water Consumption Down 17% Following Nuclear Reactor Shutdowns vdi-nachrichten.com/techn... · Posted by u/42lux

myrmidon · 6 months ago

> Yes, in a hypothetical world we can just scale up storage and decentralise production, but what are the timelines and costs on that?

Why a hypothetical world? I think that current timelines, while not particularly awe-inspiring, are quite realistic (Germany: no more coal for electricity within 2038).

I also see no problem in using gas peaker plants provisionally for the next decade, and gradually phasing them out in favor of storage as batteries get even cheaper.

Newly built nuclear power is basically useless by comparison-- construction alone currently easily takes a decade (see: Olkiluoto 3 >15y, Flamanville 3 >15y, Vogtle 3/4 >10y, Shin-Hanul 1/2 >10y), local resistance is very large, costs are astronomical.

ROI for those plants is completely abysmal already and continuously getting worse, because they are completely unable to compete with solar/wind energy prices whenever those are available.

So going "full nuclear" now would mean that all the extremely expensive effort is completely useless (climate-wise) for at least a decade (until first plants finish), while spending the same on solar/wind improves the situation right now (by allowing us to rely on fossils less often), and those projects also tend to finish within years instead of decades, and they don't need astronomical sums (and guarantees) from taxpayers to get financed.

agoose77 · 6 months ago

> I think that current timelines, while not particularly awe-inspiring, are quite realistic (Germany: no more coal for electricity within 2038).

I am not an expert on this, at all... but I'm not sure that's the case. Cf. Wikipedia:

> In March 2024, Federal Audit Office published a report in which it assessed the policy as not meeting goals on a number of points: the planned 80% share of renewable energy requires dispatchable sources but the assumed 10 GW in fossil gas generation is neither sufficient nor on schedule; extension of electric grid is behind the schedule by 6,000 km (3,700 mi) and 7 years; security of the supply chain is not sufficiently assessed; system costs to ensure 24/7 generation are underestimated and based on "best-case" scenarios; capacity installed in renewables is behind the schedule by 30%, whereas demand is expected to grow by 30% as result of electrification of heating and transport

As for

> ROI for those plants is completely abysmal already and continuously getting worse, because they are completely unable to compete with solar/wind energy prices whenever those are available.

That's because the pricing model is arbitrary. If we need nuclear, we can make it economically viable through reforming the way we purchase electricity. But,

> construction alone currently easily takes a decade

is the real problem. Unless SMRs actually materialise _and_ have fast build times, it's just not happening (and realistically, I think _that_ ship has already sailed).

I'm not really making a point here much beyond "it's one thing to say nuclear is no longer viable given our lack of investment" and another to say "it was a good thing to drop nuclear N years ago". You're not saying that, for the record. By dropping nuclear, we have to deal with a bigger shortfall, and that means gas peakers, etc.