varenc · a year ago
> “...it decided to deselect these dissertations, so that 3.2 km could be freed up for new acquisitions”

Am I reading this correctly and they have 3.2 kilometers of dissertations? What an interesting unit of paper archive size, though it makes sense.

StrangeDoctor · a year ago
I think linear bookshelf distance is a normal unit for talking about collections. It's at least as informative as the number of books. Guessing 15 meters per bookshelf from photos, that's 214 bookshelves, which doesn't sound as cool to me.
Jtsummers · a year ago
3.2km of linear storage space makes sense for books. You aren't just piling them up in stacks, where volume might be a useful measure, and you aren't putting them arbitrarily deep on the same row because that prevents access. You'll usually store things like this one book deep. If you have a 4-row shelf where you could have an 8-row shelf with the same width, each row 1m wide, you have 4m vs 8m of linear storage space.
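
The shelf arithmetic in this subthread can be checked with a short sketch. The 15 m per bookshelf figure is StrangeDoctor's guess from photos, not a measured value, and the function names are hypothetical, chosen only for illustration.

```python
import math

def bookshelves_needed(total_m: float, metres_per_bookshelf: float) -> int:
    """Whole bookshelves needed to hold total_m metres of books, rounded up."""
    return math.ceil(total_m / metres_per_bookshelf)

def linear_storage_m(rows: int, row_width_m: float) -> float:
    """Linear storage of one shelf unit with books stored one deep."""
    return rows * row_width_m

# 3.2 km of dissertations at a guessed 15 m per bookshelf
print(bookshelves_needed(3200, 15))

# Same footprint, 4 rows vs 8 rows, each row 1 m wide
print(linear_storage_m(4, 1.0), linear_storage_m(8, 1.0))
```

The count is rounded up because a partly filled bookshelf still takes up a whole bookshelf.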

Deleted Comment

Ekaros · a year ago
About 3 200 000 cm... That is actually a surprisingly large number if you assign any number of centimetres to each.
netrus · a year ago
You are off by a factor of 10.
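
A quick way to check the conversion, as a sketch: 3.2 km is 3,200 m, and each metre is 100 cm. The per-dissertation thicknesses below are made-up values for illustration only; the thread gives no actual figure.

```python
CM_PER_M = 100

def metres_to_cm(metres: int) -> int:
    """Convert whole metres to centimetres."""
    return metres * CM_PER_M

total_cm = metres_to_cm(3200)  # 3.2 km expressed in metres
print(total_cm)

# Hypothetical spine thicknesses in cm (assumptions, not from the thread)
for thickness_cm in (1, 2, 4):
    print(thickness_cm, total_cm // thickness_cm)
```

Under these assumed thicknesses the shelf space would hold somewhere between tens and hundreds of thousands of dissertations.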
pentamassiv · a year ago
My dad's PhD is listed on Google Scholar, but not digitized. Although I never read it (I don't understand it), I would like it to be preserved. All universities should provide digital copies of their students' bachelor's and master's theses as well as PhDs. Data storage is so cheap these days.
sandworm101 · a year ago
>> All universities should provide digital copies of their students bachelor's and masters thesis as well as PhDs

I'm not sure that is healthy, not for undergraduates. I'm all for open access to knowledge, but I question how much knowledge is actually in the average undergraduate thesis. I think a greater danger exists in people being held to things they said while an undergraduate student.

Famously, some of the stuff written by President Obama while he was a law student at Harvard has not been released, nor should it be. We shouldn't hold people for a lifetime to the incorrect, dangerous, or just outright silly stuff they might have said in papers when they were new to a subject. Putting undergrad work into a perpetual public archive would also have a chilling effect amongst young students who should be enjoying academic freedom. I cannot remember 99% of the stuff I wrote as an undergraduate, but I know that somewhere in there is something horrible that I am glad to have forgotten.

pentamassiv · a year ago
Or we could try to accept that everyone makes mistakes and that's fine. Scientific advancement is basically making slightly fewer mistakes.

My bachelor's thesis was pretty terrible and there probably is not much for an expert to learn from it. It would have been helpful to me to read other people's theses when I was a student though, and maybe that would have led to a better outcome.

At least here in Germany, a lot of the funding to do the research comes from the government. As a taxpayer, I'd like to be able to know the outcome of the research. I am sure there are some real gems in there too.

If a student has reasonable concerns, I would be fine with it not getting published. I believe that the default should be that it gets published.

downWidOutaFite · a year ago
Ha, my university (University of Florida) doesn't even keep its graduation records. They have an error in my 30-year-old graduation records, but it has been impossible to fix because they don't maintain the records anymore. At some point they outsourced it to a third party who is almost impossible to contact.
daydreamnation · a year ago
logging into a long dormant account to say i went to uf and there were hard copies of masters theses sitting on a shelf in the corner of one of my classrooms dated to the 70s. sounds about right for them to mess up.
pyuser583 · a year ago
There are strict legal rules about educational records.
cycomanic · a year ago
While PhD theses are typically quite straightforward, i.e. at many (most) universities a PhD thesis needs to be a proper publication, often with an associated ISBN and with a copyright licence assigned to the university (or at least a number of hard copies given to the university library), master's and bachelor's theses differ considerably. Often the copyright fully belongs to the students, and they are not required to be published (often they are not even supposed to be, as they were done at some industry partner, or the results have not been published in journals yet due to time constraints...). So it's legally not that easy for universities to publish or even archive them, especially in retrospect.
2Gkashmiri · a year ago
Shodhganga in India does that on a national level.

https://shodhganga.inflibnet.ac.in:8443/jspui/browse?type=ti...

samspenc · a year ago
I'm guessing most recent dissertations have been digitized, but this has probably been the norm only in the last 10-15 years? Most universities have likely never given thought to digitizing anything from before then, due to the extra costs that would be involved in digitizing those physical copies. I am curious how much such an effort would cost though.
not2b · a year ago
Everything was digital at UC Berkeley back in the early 1990s and before.
sharpshadow · a year ago
There needs to be a global effort to back up the Internet Archive at this point.
esskay · a year ago
Just need to find someone with ~220 PB of storage and the ability to increase that by approximately 50% annually forevermore.
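
To get a feel for what that growth rate implies, here is a rough compounding sketch. It takes the ~220 PB starting point and ~50% annual growth from the comment above at face value; neither is a verified figure, and the function name is hypothetical.

```python
def projected_storage_pb(current_pb: float, annual_growth: float, years: int) -> float:
    """Storage needed after `years` of compound growth at `annual_growth` per year."""
    return current_pb * (1 + annual_growth) ** years

# Assumed inputs from the comment: ~220 PB today, growing ~50% per year
for year in (0, 1, 5, 10):
    print(year, round(projected_storage_pb(220, 0.5, year)), "PB")
```

At a constant 50% per year the requirement grows by more than an order of magnitude within a decade, so the ongoing growth matters more than the starting size.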
adastra22 · a year ago
That's only about 38 racks of storage, at a cost of ~$3.5M for the hard drives (redundancy included). Not that big, in the grand scheme of things.
sidewndr46 · a year ago
Whenever you have that much data stored, how do you actually know the data is still there and can be retrieved? Even if you have absolutely insane connectivity to it, at some point don't you run out of time to check it? Apparently, 200 PiB at 1 GiB per second would take about 58,254 hours to retrieve.
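
The retrieval estimate can be reproduced with a few lines, assuming a single sustained stream and binary units (1 PiB = 1024 × 1024 GiB). The function name is hypothetical, and real verification would normally run in parallel across many drives rather than over one link.

```python
GIB_PER_PIB = 1024 ** 2  # 1 PiB = 1024 TiB = 1024 * 1024 GiB

def full_read_hours(size_pib: float, rate_gib_per_s: float) -> float:
    """Hours needed to read size_pib once at a sustained rate_gib_per_s."""
    seconds = size_pib * GIB_PER_PIB / rate_gib_per_s
    return seconds / 3600

hours = full_read_hours(200, 1)
print(int(hours), "hours")
print(round(hours / 24 / 365, 2), "years")
```

That works out to about six and a half years for one sequential pass, which is why the question of how to verify the data is a fair one.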
BSDobelix · a year ago
There is, at least with books etc.:

https://annas-archive.org/torrents

fngjdflmdflg · a year ago
I wonder if this is a large enough catalog for IA to fly out to the Netherlands to ship these in as they do with entire libraries:

>We will be very accepting of materials that you will pack, ship and de-dupe, and we are more selective when we have to pay and coordinate. But we can do this and we have done so for many many collections of items we do not have. For full libraries our Away Team will travel to your location to pack and ship.[0]

See also "Preserving the legacy of a library when a college closes."[1]

[0] https://help.archive.org/help/how-do-i-make-a-physical-donat...

[1] https://blog.archive.org/2019/12/10/preserving-the-legacy-of...

Daviey · a year ago
The British Library, which is responsible for hosting our PhDs, has been offline for a year following a cyber attack. It's really frustrating how long it is taking them to bring it back, and I would really value IA having an archive.
InDubioProRubio · a year ago
That long is an indicator of permanent damage? I.e., they had one copy, it's encrypted, and they hope to keep it low-key...
MichaelZuo · a year ago
The interesting question is why they aren’t expanding their archival storage space. What’s higher priority for any university archives than keeping dissertations?
eesmith · a year ago
These are dissertations from other universities, where the originating university still has a copy.

> The dissertations were originally part of an exchange programme between (mostly European) universities until the year 2004 but were never catalogued on arrival. ... The universities where these dissertations originally were defended informed UBL that they still have the dissertations and were not interested in receiving back the Leiden copy.

hyperbrainer · a year ago
Wonder when the day will arrive when universities decide to offload all archives to online media only, just keeping the most important books and maybe unique manuscripts in libraries.
MichaelZuo · a year ago
Presumably most of the dissertations produced at reputable universities would be valuable enough to keep at least 2 copies in storage.
jampekka · a year ago
Tangential: Archive.org is showing an alert popup: "Have you ever felt like the Internet Archive runs on sticks and is constantly on the verge of suffering a catastrophic security breach? It just happened. See 31 million of you on HIBP!"
consumer451 · a year ago
Wow, I'm seeing that as well.

Earlier today, I was seeing reports on Bluesky that it was down for a lot of people.

stebalien · a year ago
Possible supply-chain "attack" (or demonstration, from what I can tell) on wherever they get their polyfill library? It's coming from:

https://polyfill.archive.org/v3/polyfill.min.js?features=fet...

renewiltord · a year ago

Deleted Comment

n3uman · a year ago
https://blog.archive.org/2021/02/04/thank-you-ubuntu-and-lin... They openly show a possible vector. "The Internet Archive is wholly dependent on Ubuntu and the Linux communities that create a reliable, free (as in beer), free (as in speech), rapidly evolving operating system. It is hard to overestimate how important that is to creating services such as the Internet Archive." Maybe CUPS?
asynchronous · a year ago
I mean, that gives nothing away. If someone compromised Ubuntu the OS, they have a lot more targets than IA here.