AFAIK most miners undervolted their cards to maximise profitability vs. power/cooling requirements. Would applying new thermal paste + replacing fans on the cooler be enough?
It's obviously been running fine under high loads, so why would it just decide to stop working fine?
I think the major issue is that many cards have been running outside of spec for a long time. High loads tend to increase the risk of that, so the problem lies in figuring out whether that is the case for the card you're interested in.
The replacement method works, but it is indeed subpar. It annoys me that my phone, which is three generations newer, is a downgrade in this regard, but I can live with it.
1) UDP uses packets (datagrams) which may or may not reach their destination. If a datagram does arrive, it arrives whole, with its message boundary preserved. Normally connectionless.
2) TCP is a stream protocol. There is no packet/message boundary unless you create it yourself (my tip is to do a simple binary TLV (type length value) protocol using say a fixed 4 byte header). Requires a connection to be set up first.
3) Network byte order - really important to read about.
4) Nagle's algorithm (TCP_NODELAY) and SO_KEEPALIVE - those are a couple of things to read about.
5) Start with the simple select() approach to handle the socket activity.
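To make point 5 a bit more concrete, here is a minimal sketch of a single select() pass for an echo server. The function name `serve_once`, the 4096 byte read size and the one second timeout are my own choices for illustration, and the blocking `sendall()` is a simplification that a real server would want to revisit.

```python
import select
import socket


def serve_once(listener, clients, timeout=1.0):
    """Run one select() pass: accept new clients, echo data from readable ones."""
    readable, _, _ = select.select([listener, *clients], [], [], timeout)
    for sock in readable:
        if sock is listener:
            # The listening socket is "readable" when a connection is pending.
            conn, _addr = listener.accept()
            clients.append(conn)
        else:
            data = sock.recv(4096)
            if data:
                sock.sendall(data)  # blocking send; acceptable for a sketch
            else:
                # An empty read means the peer closed the connection.
                clients.remove(sock)
                sock.close()
```

Calling `serve_once()` in a loop gives a single-threaded server where one thread watches the listening socket and every connected client.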
You can then go ahead and get more advanced by doing nonblocking I/O, or by doing blocking I/O with each client in its own thread, figuring out the pros and cons for your use case. You can add SSL/TLS on top of your TCP connection etc.
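For the TLV tip in point 2 and the byte order note in point 3, a rough sketch of how the framing could look. The 4 byte header is split here into a 2 byte type and a 2 byte length, which is only one possible layout and an assumption on my part; `encode_frame` and `FrameDecoder` are made-up names.

```python
import struct

# Assumed header layout: 2-byte type + 2-byte payload length.
# "!" selects network byte order (big-endian), "H" is an unsigned 16-bit int.
HEADER = struct.Struct("!HH")


def encode_frame(msg_type: int, payload: bytes) -> bytes:
    """Prefix the payload with its type and length."""
    return HEADER.pack(msg_type, len(payload)) + payload


class FrameDecoder:
    """Reassembles frames from a TCP byte stream, however recv() splits it."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes):
        """Add bytes from recv() and return the complete (type, payload) frames."""
        self._buf += data
        frames = []
        while len(self._buf) >= HEADER.size:
            msg_type, length = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # payload not fully received yet
            frames.append((msg_type, bytes(self._buf[HEADER.size:end])))
            del self._buf[:end]
        return frames
```

The decoder keeps leftover bytes between calls, because a single recv() on a TCP socket can return half a frame or several frames at once.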
EDIT: The SO_KEEPALIVE part is perhaps the least important thing to start reading about. I'm a bit biased due to NAT traversal problems, as I wrote a secure remote access solution for a major company several years back, utilising STUN/TURN servers, public key authentication (basically certificate pinning), TLS etc.
I can highly recommend Beej’s guide to network programming: https://beej.us/guide/bgnet/
That together with the Linux/BSD man pages should be everything you need; there is some great documentation there.
That, combined with the fact that VAOs, which came in version 3, are the last feature, makes all threats to force you to update weak at best.
As hardware peaks you can stop worrying about new things and write software that never goes bad.
The wheel has not been rediscovered for a long time. Focus on the vehicle instead = bike.
As for the SPI flash size: they are almost always given in Mbit, so 16Mbit is 2MB, which is the source of the confusion if I were to guess. You would be looking for a 128Mbit one to get 16MB.
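The conversion is just a division by 8 (bits per byte). As a quick sanity check, with `mbit_to_mbyte` being a throwaway helper name of my own:

```python
def mbit_to_mbyte(mbit):
    """Convert a flash size in megabits to megabytes (8 bits per byte)."""
    return mbit / 8


print(mbit_to_mbyte(16))   # the 16Mbit part
print(mbit_to_mbyte(128))  # the 128Mbit part
```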
Nice work and keep on tinkering!