Metrics should be emitted in a separate stream and never via logs outside of corner cases. Logs should be used to determine WHY the system is having issues, never WHETHER it is having issues.
Log alerting is a fool's errand that looks like a great idea at the start but quickly becomes a sand trap: it will drive future maintainers crazy and, at scale, it will overwhelm your systems.
Why is log alerting a bad idea?
Every log line becomes a metric point that must be dealt with, so the logging system must be kept operational and error-free. Due to the other problems below, that system quickly becomes a beast of its own.
Logs are generally much bigger than a KV pair of <Metric> <Value>, so there ends up being a ton of filtering going on in the logging system, adding to the load.
The logging system probably does not understand rates, so you end up writing gnarly queries to answer "Is this my first unhandled exception in 10m, or my 50th?" The equivalent query in Prometheus is much, much simpler.
Each language's logging library handles things differently, so the organization must be on point to either A) keep the log format the same across all languages, or B) teach the logging system how to massage each log into a format the alerting system can handle. Obviously A causes massive developer friction and B causes massive Ops friction.
Finally, I find that people who rely on log alerting tend not to handle exceptions as well, because they can just trust the logging system to alert them on a specific problem and deal with it manually.
So, for the sake of the future Ops person who has to deal with your code, I'm begging you: import prometheus_client.
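To make that concrete, here's a minimal sketch assuming nothing beyond prometheus_client itself; the metric name, port, and alert threshold are illustrative, not prescribed by anything above:

```python
# Minimal sketch: count unhandled exceptions as a metric instead of alerting on logs.
# Metric name, port, and threshold are illustrative choices.
import sys
from prometheus_client import Counter, start_http_server

UNHANDLED_EXCEPTIONS = Counter(
    "unhandled_exceptions_total",
    "Unhandled exceptions raised by this service",
)

def _count_and_log(exc_type, exc_value, exc_traceback):
    # The log still explains WHY; the counter answers WHETHER something is wrong.
    UNHANDLED_EXCEPTIONS.inc()
    sys.__excepthook__(exc_type, exc_value, exc_traceback)

sys.excepthook = _count_and_log

# Expose /metrics for Prometheus to scrape.
start_http_server(8000)

# The "first or 50th in 10m?" question then becomes a one-line rule in Prometheus, e.g.:
#   rate(unhandled_exceptions_total[10m]) > 0.05
```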
I've spent the last few years in Python land, recently heavily LLM-assisted, but I'm itching to do something with Ruby (and/or Rails) again.
It also serves as a natural sandbox for the "setup" part, so we can always know that the script gets interpreted within a finite (and short) time and no weird stuff can ever happen.
Of course, there are ways to combine the two (e.g. GitLab can generate and then trigger downstream pipelines from within the running CI), but the default is the script. It also has the side effect that pipeline setup can't ever do stuff that cannot be debugged (because it runs _before_ the pipeline). But I concede that this is not that clear-cut; both have advantages.
My argument is that we should acknowledge that any CI/CD system intended for wide usage will eventually arrive here, and it's better that we go into that intentionally rather than accidentally.
> The Fix: Use a full modern programming language, with its existing testing frameworks and tooling.
I was reading the article and thinking to myself, "a lot of this is fixed if the pipeline is just a Python script." And really, if I were to start building a new CI/CD tool today, the "user facing" portion would be a Python library that contains helper functions for interfacing with the larger CI/CD system. Not because I like Python (I'd rather use Ruby) but because it is ubiquitous and completely sufficient for describing a CI/CD pipeline.
I'm firmly of the opinion that once we start implementing "the power of real code: loops, conditionals, runtime logic, standard libraries, and more" in YAML, then YAML was the wrong choice. I absolutely despise Ansible for the same reason and wish I could still write Chef cookbooks.
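As a rough thought experiment of what that "user facing" library could look like (everything below is hypothetical; the Pipeline class and its helpers are invented for illustration, not any existing tool's API):

```python
"""Hypothetical sketch of the "user facing" part of a CI/CD tool as a small
Python library. All names (Pipeline, stage, run) are invented for illustration."""
import os
import subprocess
from contextlib import contextmanager


class Pipeline:
    def __init__(self, name: str):
        self.name = name
        self.steps: list[tuple[str, str]] = []  # (stage name, shell command)

    @contextmanager
    def stage(self, name: str):
        self._current = name
        yield
        self._current = None

    def run(self, command: str) -> None:
        # A real tool would hand this to a runner/executor; here we just record it.
        self.steps.append((self._current, command))

    def execute(self) -> None:
        for stage, command in self.steps:
            print(f"[{self.name}/{stage}] {command}")
            subprocess.run(command, shell=True, check=True)


pipeline = Pipeline("build-and-test")

with pipeline.stage("test"):
    # Real loops instead of YAML templating.
    for version in ("3.11", "3.12"):
        pipeline.run(f"tox -e py{version.replace('.', '')}")

with pipeline.stage("package"):
    # Real conditionals instead of `rules:`/`when:` blocks.
    if os.environ.get("CI_BRANCH") == "main":
        pipeline.run("python -m build")

if __name__ == "__main__":
    pipeline.execute()
```

Which is the point of the quoted "fix": the pipeline becomes an ordinary module you can unit-test with the language's existing tooling.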
This is the nasty key point. The reliability is decided client-side.
For example, systemd-resolved at times enacted maximum technical correctness by always returning the lowest IP address. After all, DNS-RR is not well defined, so always returning the lowest IP is not wrong. It got changed after some riots, but as far as I know, Debian 11 is stuck with that behavior, or was for a long time.
Or, I deal with many applications with shitty or no retry behavior. They go "Oh no, I got one connection refused, gotta cancel everything, shut down, never try again." So now 20-30% of all requests die in a fire.
It's an acceptable solution if you have nothing else. As the article notes, if you have quality HTTP clients with a few retries configured on them (like browsers), DNS-RR is fine for finding an actual load balancer with health checks and everything, which can provide a 100% success rate.
But DNS-RR is no load balancer, and load balancers are better.
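For what it's worth, here is a rough sketch of what "reliability decided client-side" means in practice: resolve every record, don't trust the resolver's ordering, and fall through to the next address on failure. The host, port, and timeout below are placeholders.

```python
# Rough sketch of client-side handling of DNS round-robin: try every resolved
# address instead of dying on the first "connection refused".
import random
import socket

def connect_any(host: str, port: int, timeout: float = 3.0) -> socket.socket:
    # getaddrinfo returns all A/AAAA records the resolver hands back.
    addrs = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    random.shuffle(addrs)  # don't rely on the resolver's ordering (see above)
    last_error = None
    for *_, sockaddr in addrs:
        try:
            return socket.create_connection(sockaddr[:2], timeout=timeout)
        except OSError as exc:
            last_error = exc  # dead backend: fall through to the next record
    raise last_error or OSError(f"no addresses for {host}")

# conn = connect_any("example.com", 443)
```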
There were definitely some warts in that system but as those sorts of systems go it was fast, easy to introspect, and relatively bulletproof.
I think AWS will do 5 Gbps with a capable peer -- which is their limit for a single flow [1] -- but you might need to tell them first so they don't kill public networking on the instance. I found that UDP iperf tests reliably got my instance's internet shut off, so keep that in mind. On the other hand, OVH will happily do 5-ish Gbps to/from my EC2 instance in a TCP iperf test, but won't tolerate more than 1 Gbps of inbound UDP. OVH support has indicated that this is expected, though they don't document that limitation, and it seemed that both their support and network engineering people were themselves unaware of the limit until we complained. They don't seem to have the same limits on ESP, which is why I developed an interest in IPsec arcana.
[1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-inst...
Wait, what? I'm pretty sure I still used unencapsulated ESP a few months ago… though I wouldn't necessarily notice if it negotiated UDP after some update, I guess… _starts looking at things_
Edit: the strongSwan 6.0 beta documentation still lists "<conn>.encap default: no" as a config option — this wouldn't make any sense if UDP encapsulation were always on now. Are you sure about this?
There's an issue that has been open for years; it will probably never be fixed:
Although, I do feel slighted when a manager acknowledges the absurdity of all the corporatisms we hear every day and then proceeds to preach them to everyone and waste time anyway. Like, please, I thought we just agreed this is all fluff.