Readit News
dpw commented on How we use HashiCorp Nomad   blog.cloudflare.com/how-w... · Posted by u/jen20
jeffbee · 6 years ago
"... here is the CPU usage over a day in one of our data centers where each time series represents one machine and the different colors represent different generations of hardware. Unimog keeps all machines processing traffic and at roughly the same CPU utilization."

Still a mystery to me why "balancing" has SO MUCH mindshare. This is almost certainly not the optimal strategy for user experience. It is going to be much better to drain traffic away from older machines while newer machines stay fully loaded, rather than running every machine at equal utilization factor.

dpw · 6 years ago
I'm an engineer at Cloudflare, and I work on Unimog (the system in question).

You are right that even balancing of utilization across servers with different hardware is not necessarily the optimal strategy. But keeping faster machines busy while slower machines are idle would not be better.

This is because the time to service a request is only partly determined by the time it takes while being processed on a CPU somewhere. It's also determined by the time that the request has to wait to get hold of a CPU (which can happen at many points in the processing of a request). As the utilization of a server gets higher, it becomes more likely that requests on that server will end up waiting in a queue at some point (queuing theory comes into play, so the effects are very non-linear).

Furthermore, most of the increase in server performance in the last 10 years has come from adding more cores and from non-core improvements (e.g. larger caches). Single-thread performance has increased, but more modestly.

Putting those things together, if you have an old server that is almost idle, and a new server that is busy, then a connection to the old server will actually see better performance.
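The non-linear queueing effect described above can be illustrated with a toy model (this is my illustration, not Unimog's actual model): in an M/M/1 queue, mean response time is service_time / (1 - utilization), so queueing delay blows up as utilization approaches 1. The server speeds and utilization figures below are made-up numbers for the sketch.

```python
def mean_response_time(service_time_ms: float, utilization: float) -> float:
    """M/M/1 mean response time (service + queueing delay), in ms."""
    assert 0.0 <= utilization < 1.0
    return service_time_ms / (1.0 - utilization)

# Old, slower server that is almost idle: 2 ms service time, 10% utilized.
old = mean_response_time(2.0, 0.10)   # ~2.22 ms

# New, faster server that is busy: 1 ms service time, 90% utilized.
new = mean_response_time(1.0, 0.90)   # 10.0 ms
```

Despite being twice as fast per request, the busy new server delivers far worse latency in this model, because almost all of its response time is spent queueing.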

There are other factors to consider. The most important duty of Unimog is to ensure that when the demand on a data center approaches its capacity, no server becomes overloaded (i.e. its utilization goes above some threshold where response latency starts to degrade rapidly). Most of the time, our data centers have a good margin of spare capacity, and so it would be possible to avoid overloading servers without needing to balance the load evenly. But we still need to be confident that if there is a sudden burst of demand on one of our data centers, it will be balanced evenly. The easiest way to demonstrate that is to balance the load evenly long before it becomes strictly necessary. That way, if the ongoing evolution of our hardware and software stack introduces some new challenge to balancing the load evenly, it will be relatively easy to diagnose it and get it addressed.

So, even load balancing might not be the optimal strategy, but it is a good and simple one. It's the approach we use today, but we've discussed more sophisticated approaches, and at some point we might revisit this.

dpw commented on Guédelon Castle   en.wikipedia.org/wiki/Gu%... · Posted by u/bane
dpw · 6 years ago
Wow, unusual topic for the front page of HN. I visited about 5 years ago. It was my wife's idea to go (it's quite far from other attractions, so you have to plan a visit), but we both really enjoyed it. The castle is the highlight of course, but there's quite a lot more to the site than that, and it's easy to spend a full day there.
dpw commented on However improbable: The story of a processor bug   blog.cloudflare.com/howev... · Posted by u/dsr12
paradroid · 8 years ago
@cloudflare I took the photo in your lede :)
dpw · 8 years ago
Thank you! We often use Creative Commons-licensed images in our blog posts. We always include credit, but we owe a big debt of gratitude to the people who take these photos and make them available.
dpw commented on Cloudflare discolors the Web   pwmon.org/p/5470/cloudfla... · Posted by u/pdknsk
dpw · 9 years ago
I work for Cloudflare.

Thanks for bringing this bug to our attention. We have just rolled out a fix. You might need to go into the CF dashboard and purge the cache for your site to see the fix take effect.

dpw · 9 years ago
I spoke too soon! We're going to disable this change for a while. Sorry.
dpw commented on Qanat   en.wikipedia.org/wiki/Qan... · Posted by u/baghali
dpw · 10 years ago
I saw lots of these in Morocco, between the mountains and the Sahara.

Well, what I saw were the regularly spaced mounds of earth at the top of the access shafts.

dpw commented on Intel Is Preparing a Major Restructuring of Their Graphics Driver   phoronix.com/scan.php?pag... · Posted by u/buserror
dpw · 10 years ago
"As of yet I don't have a clear picture what this new driver will look like once evolved besides hearing 'boxes, many fucking boxes mate', when being told about the increased abstractions of the multi-OS-focused driver design."

If only more discussions about software architecture were that honest.

dpw commented on Docker Goes native on non-Linux OS with latest beta   releasemanagement.org/201... · Posted by u/kiyanwang
dpw · 10 years ago
So "native" means "not VirtualBox" now? Docker for Mac might be a significant step forward compared to the previous solutions, but it still involves a Linux VM. I guess you could say that xhyve is a native hypervisor, but that's a bit weaselly.
dpw commented on Benchmarking Message Queue Latency   bravenewgeek.com/benchmar... · Posted by u/tylertreat
dpw · 10 years ago
A few things about the article that made me think "hmmmm":

No mention of testing set-up. Was the test client running on a different machine from the server? What kind of machines? What kind of network?

Many of the charts have the same "ballooning" shape, despite measuring very different systems. I think this is due to the "attempt to correct coordinated omission by filling in additional samples". As I understand it, all charts but the first have this correction applied (and it does sound like it is applied by manipulating the data, not by altering the measurement method). To understand the effect this might have, imagine testing a system that has a single request queue by making requests on a regular schedule, say at 1ms intervals. Most of the time, these take much less than 1ms, but one request is an outlier and takes 100ms. What will the "corrected" results look like? The worst case will be 100ms, the second worst case 99ms, the third worst case 98ms, and so on. On a linear horizontal scale, this gives a linear slope at the right-hand side of the chart. Change to a logarithmic horizontal scale, and you get the shape seen in many of the charts in this article. This makes it impossible to tell whether the worst cases are due to a small number of outliers or not. I believe the correction is well-meaning, but I think the uncorrected results would be more informative.
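The thought experiment is easy to demonstrate (this is a sketch of the back-filling behaviour as I understand it, not the article's actual code): requests go out every 1ms, one stalls for 100ms, and the correction back-fills a latency sample for every send slot blocked behind the stall.

```python
INTERVAL_MS = 1.0   # requests issued on a fixed 1 ms schedule
STALL_MS = 100.0    # a single outlier request takes 100 ms

# Back-fill one sample per send slot that was blocked behind the stall:
# 100, 99, 98, ..., 1 -- a linear ramp of "corrected" latencies.
corrected = []
t = STALL_MS
while t > 0:
    corrected.append(t)
    t -= INTERVAL_MS
```

Sorted from worst to best, the single outlier has become a straight line of 100 samples; on a log-scaled axis that line curves into the "balloon" shape, and you can no longer tell one outlier apart from many genuinely slow requests.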

The use of line charts is a bit odd. They are connected to the origin, which is obviously a fiction. They are also slightly smoothed - where steps are visible, the steps have a gradient rather than being a vertical line. Where the number of data points is low, this leads to odd effects: in the two 1MB charts, the right third of the chart just shows the value of a single data point! A scatter plot might give the reader a more honest impression.

The logarithmic horizontal scale of those charts tends to focus attention on the worst cases. That's not unreasonable - in some contexts, that's what you really care about. But unless you make an effort to prevent them, outliers might occur due to environmental effects like kernel scheduling, VM scheduling, dropped packets on a noisy network, etc. And the log scale makes it very hard to see the typical values on the charts for RabbitMQ and Kafka, where the range of Y values is large. Can you tell what the median latency is for RabbitMQ or Kafka at any message size? It looks like about 0.5ms to me, but it's hard to read from any of the charts.

The number of messages involved is different for different message sizes. You can see that from the way the 1MB charts are stepped, but the charts for smaller message sizes are smoothed. For 1MB messages, it looks like there are 5k or 10k samples on the charts; for the smaller message sizes, probably far more. Were all the tests run for roughly the same amount of time? Tests run for longer might see more outliers due to the environment.

"The 1KB, 20,000 requests/sec run uses 25 concurrent connections". With the implication that other test runs had different levels of concurrency. So what were they? What was the impact of changing the concurrency levels while the message size/rate was constant?

Is it possible that the client program making the measurements was introducing artefacts of its own (for example, being written in Go, did it encounter any GC pauses)? It would be interesting to see the results of measurements against a simple TCP echo server, as a control.

My criticisms may seem too harsh. It is too much to expect someone to spend weeks doing rigorous measurements, and the resulting article would be so long that hardly anyone would read all of it (sounds like academia!). Someone might say that I should do my own experiments if I think I can do them better; but I have a day job too. I don't want to discourage the author; I think it is good that the author did the work he did and put it up for everyone to see. But when articles like this get linked on HN and read by lots of people, they can easily get regarded as conclusive. Ideas about the performance of various projects get established that might not be well-founded and can take years to dispel. So all I'm saying is: reader beware!

dpw commented on Rocket Fiber Launches 100GB/s Internet Service in Downtown Detroit   techcrunch.com/2015/11/12... · Posted by u/prostoalex
dpw · 10 years ago
The linked article states 100 Gb/s, i.e. gigabits per second. But the post title says "100GB/s" which suggests gigabytes per second. Worth correcting, because getting 100GB/s between two machines in the same rack would be quite an achievement today.
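The factor of eight between the two units is easy to check:

```python
# Bits vs bytes: 8 bits per byte.
gigabits_per_second = 100
gigabytes_per_second = gigabits_per_second / 8   # 12.5 GB/s, not 100 GB/s
```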
