almeria (u/almeria) - Readit News

almeria commented on I moderate /r/kafka; people mistake it as a subreddit about kafka the product twitter.com/LitAnscombe/s... · Posted by u/mooreds

curryst · 4 years ago

I've worked at some places that used Kafka (including LinkedIn), although I have never been responsible for running the platform itself. I'll chip in with what I see as the negatives.

Kafka sits at roughly the same tier as HTTP, but lacks a lot of the convention we have around HTTP. There's a lot of convention around HTTP that allows people to build generic tooling for any apps that use HTTP. Think visibility, metrics, logging, etc, etc. Those are all things you effectively get for free with HTTP in most languages. Afaict, most of that doesn't exist for Kafka in a terribly helpful form. You can absolutely build something that will do distributed tracing for Kafka messages, but I'm not aware of a plug-and-play version like there is for HTTP in most languages.

The fact that Kafka messages are effectively stateless (in the UDP sense, not the application sense) also trips up a lot of people. If you want to publish a message, and you care what happens to that message downstream, things get complicated. I've seen people do RPC over event buses where they actually want a response back, and it became this complicated system of creating new topics so the host that sent the request would get the response back. Again, in HTTP land, you'd just slap a loadbalancer in front of the app and be done. HTTP is stateful, and lends itself to stateful connections.
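For anyone who hasn't seen the "RPC over an event bus" pattern, here is a rough sketch of the plumbing it tends to need. It uses a toy in-memory bus, not a real Kafka client, and every name in it (`ToyBus`, `send_request`, `serve_one`, `await_reply`, the topic names) is made up for illustration.

```python
import uuid
from collections import defaultdict, deque


class ToyBus:
    """In-memory stand-in for a topic-based bus. Not a Kafka client."""

    def __init__(self):
        self.topics = defaultdict(deque)

    def publish(self, topic, message):
        self.topics[topic].append(message)

    def poll(self, topic):
        queue = self.topics[topic]
        return queue.popleft() if queue else None


def send_request(bus, reply_topic, payload):
    # The caller has to invent an ID and say where the answer should go,
    # because the bus itself has no notion of "the response to this message".
    correlation_id = str(uuid.uuid4())
    bus.publish("requests", {
        "correlation_id": correlation_id,
        "reply_to": reply_topic,
        "payload": payload,
    })
    return correlation_id


def serve_one(bus):
    # Hypothetical worker: handles one request and publishes the result
    # to whatever reply topic the caller named.
    request = bus.poll("requests")
    if request is None:
        return False
    bus.publish(request["reply_to"], {
        "correlation_id": request["correlation_id"],
        "result": request["payload"].upper(),
    })
    return True


def await_reply(bus, reply_topic, correlation_id):
    # Drain the reply topic until our correlation ID shows up.
    # Returns None if no matching reply is queued.
    while True:
        reply = bus.poll(reply_topic)
        if reply is None:
            return None
        if reply["correlation_id"] == correlation_id:
            return reply["result"]
```

Even in toy form, the caller has to manage a per-host reply topic, a correlation ID, and the case where the reply never arrives. A plain HTTP request/response gives you all three for free.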

Another issue is that when you tell people that they can adjust their schema more often, they tend to go nuts. Schemas start changing left and right, and suddenly you need a product to orchestrate these schema changes and ensure you're using the right parser for the right message. Schema validation starts to become a significant hurdle.
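A minimal sketch of what "the right parser for the right message" ends up meaning in practice, assuming each message carries an explicit `schema_version` field in a JSON envelope. The envelope layout, field names, and parser functions are all hypothetical, not anything Kafka or a schema registry prescribes.

```python
import json


def parse_v1(body):
    # Hypothetical v1 schema: a single "name" field.
    return {"first_name": body["name"], "last_name": ""}


def parse_v2(body):
    # Hypothetical v2 schema: the name was split into two fields.
    return {"first_name": body["first_name"], "last_name": body["last_name"]}


# Every schema change adds an entry here, and old entries have to stay
# for as long as old messages can still be read or replayed.
PARSERS = {1: parse_v1, 2: parse_v2}


def parse_message(raw):
    envelope = json.loads(raw)
    version = envelope.get("schema_version")
    parser = PARSERS.get(version)
    if parser is None:
        raise ValueError(f"no parser registered for schema_version={version!r}")
    return parser(envelope["body"])
```

The dispatch table is trivial with two versions. The pain described above starts when it has dozens of entries owned by different teams.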

It's also architecturally complicated to replace HTTP. An HTTP app can be just a single daemon, or a few daemons with a load balancer or two in front. Kafka is, at minimum, your app, a Kafka daemon, and a Zookeeper daemon (nb I'm not entirely sure Zookeeper is still required). You also have to deal with eventual consistency, which can make coding and reasoning about bugs dramatically harder than it needs to be. What happens when Kafka double-delivers a message?
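One common answer to the double-delivery question is to make the consumer idempotent. Below is a bare-bones sketch, assuming every message carries a unique `id`. The class and field names are invented for the example, and a real system would have to persist the seen IDs together with the side effect rather than keep them in memory.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed on a message ID."""

    def __init__(self):
        self.seen_ids = set()  # in-memory only; an assumption of this sketch
        self.balance = 0

    def handle(self, message):
        # A redelivered message is recognised by its ID and skipped.
        if message["id"] in self.seen_ids:
            return False
        self.balance += message["amount"]
        self.seen_ids.add(message["id"])
        return True
```

Without the ID check, delivering the same "add 50" message twice would leave the balance at 100 instead of 50.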

My pitch is always that you shouldn't use Kafka unless it becomes architecturally simpler than the alternatives. There are problems to which Kafka is a better solution than HTTP, but they don't start with unstable schemas or databases being difficult. Huge volumes of data is a good reason to me; not being sure what your downstreams might be is another. There are probably more, I'm not an expert.

> our customers don't understand the data they're shoving at us. But Kafka will take care of all of that for us

Kafka isn't going to help with this at all. If your HTTP app can't parse it, neither will your Kafka app. Kafka does have the ability to do replays, but so does shoving the requests in S3 or a database for processing later. I promise you that "SELECT * FROM requests WHERE status='failed'" is drastically simpler than any Kafka alternative. It is neat that Kafka lets you "roll back time" like that, but you have to very carefully consider the prospect of re-processing the messages that already succeeded. It's very easy to get a bug where you have double entries in databases or other APIs because you're reprocessing a request.
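To make the database version of "replay" concrete, here is a small sketch using Python's built-in `sqlite3` module. The `requests` table layout and the `record` / `retry_failed` helpers are assumptions made for the example; only the `status='failed'` query comes from the comment above.

```python
import sqlite3


def open_store():
    # In-memory database so the sketch is self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE requests ("
        "id INTEGER PRIMARY KEY, body TEXT NOT NULL, status TEXT NOT NULL)"
    )
    return conn


def record(conn, body, status):
    conn.execute("INSERT INTO requests (body, status) VALUES (?, ?)", (body, status))


def retry_failed(conn, process):
    # Only rows marked 'failed' are selected, so requests that already
    # succeeded are never processed a second time.
    failed = conn.execute(
        "SELECT id, body FROM requests WHERE status = 'failed'"
    ).fetchall()
    retried = 0
    for request_id, body in failed:
        try:
            process(body)
        except ValueError:
            continue  # still broken; leave it marked 'failed'
        conn.execute("UPDATE requests SET status = 'done' WHERE id = ?", (request_id,))
        retried += 1
    conn.commit()
    return retried
```

Because the retry is driven by a status column rather than by rewinding a log, the double-entry bug described above does not come up here: successful rows are simply not in the result set.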

almeria · 4 years ago

Very helpful, thanks

almeria commented on I moderate /r/kafka; people mistake it as a subreddit about kafka the product twitter.com/LitAnscombe/s... · Posted by u/mooreds

zwkrt · 4 years ago

My take was humorous but it didn’t hide anything. Kafka was built so that LinkedIn could shove all its real-time click data through a single funnel—terabytes upon terabytes. It has since been evangelized and created a cottage industry of Confluent salespeople who will give your manager a course in how to lobby their engineers into using Kafka. Have scaling problems? Kafka. Have business events that need to be ordered? Kafka! Have “changing schemas”? KAFKA!! I’m always suspicious when a company gives a product away for free but then charges $$$ for “support”.

I worked for a high profile recently-failed project from a company that rhymes with Brillo, and our data was just beginning to be too big for google sheets (!). However, we were also having organizational problems because the higher ups were seeing the failing project losing money so they of course decided to hire 100 extra engineers. Our communications (both human and programmatic) were failing and the Confluent salespeople began circling like buzzards. Of course by the time it was suggested we use it the project was already 6 months past the point of no return.

My advice is that if your data fits in a database, use a database. Anyone who says that isn’t scalable should have to tell you the actual reason it doesn’t scale and the number of requests/users/GBs/uptime/ etc that is the bottleneck.

almeria · 4 years ago

Thank you. I am now enlightened.

almeria commented on UC slams the door on standardized admissions tests, nixing any SAT alternative latimes.com/california/st... · Posted by u/djkivi

eesmith · 4 years ago

I take it you haven't consulted the references I listed second-hand? They are likely far more insightful than anything I can write here.

Before making my earlier comment, I read the start of the Hiss and Franks paper at https://web.archive.org/web/20140310113612/http://www.nacacn... to make sure the citation I gave wasn't misrepresenting the topic (it wasn't). Here's text from the abstract:

It "examines the outcomes of optional standardized testing policies in the Admissions offices at 33 public and private colleges and universities, based on cumulative GPA and graduation rates."

That is, UC isn't the first to do this, and we can look at real-world evidence from previous universities where test scores were optional, to gauge how useful test scores are and what effect they have.

It found: "Few significant differences between submitters and non-submitters of testing were observed in Cumulative GPAs and graduation rates, despite significant differences in SAT/ACT scores."

That is, SAT/ACT scores don't seem to affect metrics like Cumulative GPAs and graduation rates. (As I recall from elsewhere, they do correlate with first year grades, but that's a different and less important metric.)

Further: "Optional testing policies also help build broader access to higher education: non-submitters are more likely to be first-generation-to‐college students, minorities, Pell Grant recipients, women and students with Learning Differences."

That is, using SAT/ACT scores in the selection process appears to have a measurable effect on the student population: it reduces what is sometimes referred to as "diversity."

As to your correct observation, "it does not follow that predicts(A) > predicts(A ∪ B)", one of the issues is that college success is also correlated with other factors, including parental wealth. And success on the SAT/ACT is also correlated with parental wealth, since wealthier parents can afford special training on how to pass those tests.

We know this by looking at the early history of college boards, which emphasized the topics taught at prep schools (like Latin grammar) over those taught at public schools, because college admissions preferred rich white male Protestants, who were likely to go to prep school.

To be clear, the SAT has done a lot of work to de-bias their tests, and I don't know enough about the topic to say anything about what factors are actually involved.

But I don't need to, since the tests don't seem to be that effective in predicting college success.

almeria · 4 years ago

Thanks for the detailed and nuanced response.

Which, if I can attempt to boil it down to one sentence, would seem to be: "Tests don't seem to be a good predictor of college success, even when taking GPA into account. But they do seem to promote diversity."

And to which I would like to add, solely from anecdotal observation[1]:

"They also provide a second chance for a significant number of people. Even with a bad year, or even two, in high school -- this doesn't necessarily mean you are doomed to a lifetime of minimum wage servitude. If you have the intrinsic cognitive skills, you can easily do moderately well on the SAT, even without prepping."

I would like to consider diversity and the possibility of a second chance to be not merely incidental, but essential benefits to be striven for in the admissions process. Instead of simply stack-ranking based on the most easily demonstrable predictors of success.

[1] Including observations of some of the smartest and most creative and inspiring people I have ever met -- but again, that's anecdotal.

almeria commented on I moderate /r/kafka; people mistake it as a subreddit about kafka the product twitter.com/LitAnscombe/s... · Posted by u/mooreds

zwkrt · 4 years ago

Comment I made last summer:

It’s rare a piece of tech has a more fitting name! “Is your org's politics so complicated that direct team-to-team communication has broken down? Is your business process subject to unannounced violent change? Bogged down by consistent DB schemas and API versioning? Tired of retries on failed messages? Introducing Kafka by Apache: an easy to use, highly configurable, persistently stored ephemeral centralized federated event based messaging data analytics Single Source of Truth as a Service that can scale with any enterprise-sized dysfunction!”

almeria · 4 years ago

Anything more about Kafka you can tell us?

Seriously, the one time I was in a situation where much of the team seemed hellbent on this "just put all in Kafka" idea (without really understanding why, exactly) the arguments they came up with were not too dissimilar from what you've shared with us above. It all seemed to come down to "OMG databases are hard, schemas are hard, our customers don't understand the data they're shoving at us. But Kafka will take care of all of that for us. Because, you know, shiny."

That said I'd still like to have a more ... balanced understanding of why Kafka may not necessarily be The Answer, and/or have more hidden complexity or other negative tradeoffs than we may have bargained for.

almeria commented on UC slams the door on standardized admissions tests, nixing any SAT alternative latimes.com/california/st... · Posted by u/djkivi

eesmith · 4 years ago

Please clarify. My understanding is that it's the opposite.

Eg, quoting https://journals.sagepub.com/doi/pdf/10.3102/0013189X2090211...

> Yet, the emphasis on test scores over grades in policy and practice recommendations stands in contrast to research showing high school grade point averages (HSGPAs) are stronger predictors than test scores of college outcomes (Bowen et al., 2009; Geiser & Santelices, 2007; Hiss & Franks, 2014; Kobrin et al., 2008).

almeria · 4 years ago

And what about predictions based on GPA and test scores?

As has been in use at UC and just about everywhere else, up until now.

In math terms: just because predicts(A) > predicts(B), it does not follow that predicts(A) > predicts(A ∪ B).
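A small numeric illustration of that last point, reading predicts() as in-sample R² from a least-squares fit. That reading is an assumption of this sketch, and the numbers below are entirely made up, not admissions data. With an intercept, adding a second predictor can never lower in-sample R², so a better single predictor tells you nothing about whether the pair does better still.

```python
def _mean(values):
    return sum(values) / len(values)


def _cov_sum(xs, ys):
    # Sum of products of deviations from the mean (unnormalised covariance).
    mx, my = _mean(xs), _mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


def r_squared_one(x, y):
    # R^2 of y regressed on a single predictor x (with intercept).
    return _cov_sum(x, y) ** 2 / (_cov_sum(x, x) * _cov_sum(y, y))


def r_squared_two(x1, x2, y):
    # R^2 of y regressed on x1 and x2 together (with intercept),
    # solving the centred 2x2 normal equations directly.
    s11, s22, s12 = _cov_sum(x1, x1), _cov_sum(x2, x2), _cov_sum(x1, x2)
    s1y, s2y, syy = _cov_sum(x1, y), _cov_sum(x2, y), _cov_sum(y, y)
    det = s11 * s22 - s12 * s12  # zero only if x1 and x2 are collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return (b1 * s1y + b2 * s2y) / syy


# Invented toy values, purely for illustration.
gpa = [2.0, 2.5, 3.0, 3.5, 4.0, 3.2]
test = [1000, 1300, 1100, 1400, 1250, 1500]
outcome = [2.1, 2.9, 2.8, 3.6, 3.5, 3.7]

r2_gpa = r_squared_one(gpa, outcome)
r2_test = r_squared_one(test, outcome)
r2_both = r_squared_two(gpa, test, outcome)
```

Here `r2_both` is at least as large as both `r2_gpa` and `r2_test`, whichever single predictor happens to win. This is an in-sample property only; whether the gain holds up out of sample is an empirical question the sketch does not address.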

almeria commented on What every IT person needs to know about OpenBSD Part 3: That packet filter blog.apnic.net/2021/11/11... · Posted by u/zdw

hypertele-Xii · 4 years ago

Every IT person doesn't need to know about OpenBSD at all. Clickbait title.

I'm an IT person, know nothing about OpenBSD (except that it's an OS), and am doing just fine.

almeria · 4 years ago

Agreed, and you made an excellent call there.

If network ops is your bag, then yeah, maybe you "ought to" know about what's in the article.

Otherwise -- it's just pure FOMO-based clickbait.

almeria commented on The Age of AI by Henry Kissinger, Eric Schmidt and Daniel Huttenlocher economist.com/books-and-a... · Posted by u/helsinkiandrew

ch4s3 · 4 years ago

When was Kissinger's judgement ever good?

almeria · 4 years ago

Depends what you mean by "good".

For the ruthlessly cynical and basically destructive ends toward which this man has devoted his life - his judgement is arguably quite effective.

That is, after all, how you get to be "America's preeminent living statesman".

almeria commented on The Age of AI by Henry Kissinger, Eric Schmidt and Daniel Huttenlocher economist.com/books-and-a... · Posted by u/helsinkiandrew

boomboomsubban · 4 years ago

Really hard to believe that a CEO of a company with the motto "don't be evil" could write a book with Kissinger. It's unsurprising that the book's main push seems to be for more AI war research as a method of saving lives.

almeria · 4 years ago

Did we ever have any reason to believe that slogan?

almeria commented on Amazon workers were left 'terrified and powerless' after concealed Covid cases news.sky.com/story/covid-... · Posted by u/jdkee

almeria · 4 years ago

The $500k fine is of course a complete joke, considering the seriousness of the matter at hand.

But "as the company enjoyed booming and historic sales with its stock price doubling, Amazon failed to adequately notify warehouse workers and local health agencies of COVID-19 case numbers", the state's attorney general Rob Bonta said.

"This left many workers understandably terrified and powerless to make informed decisions to protect themselves and to protect their loved ones."

Amazon lurkers out there - especially L6 and above - do you have anything to say about this? Anything at all?

Or are you so used to hearing such patent bullshit from your higher-ups (of the sort pasted below) that it just ... goes through you?

Nothing personal - I'd just really like to know.

"This settlement is solely about a technicality specific to California state law surrounding the structure of bulk employee COVID-related notifications," the spokesperson added.

"There's no change to, or allegations of any problems with, our protocols for notifying employees who might have been in close contact with an affected individual.

"We've worked hard from the beginning of the pandemic to keep our employees safe and deliver for our customers - incurring more than $15bn (£11bn) in costs to date - and we'll keep doing that in months and years ahead," they added.

almeria commented on You need to write in order to think well twitter.com/paulg/status/... · Posted by u/DantesKite

zaptheimpaler · 4 years ago

No you don't.

almeria · 4 years ago

The fact that you don't see a need to offer any justification for that statement ... proves PG's point.