jedbrown · 10 years ago
Does anyone have numbers on memory bandwidth and latency?

The x1 cost per GB is about 2/3 that of r3 instances, but if you provision the same amount of memory via r3 instances you get 4x as many memory channels, so the cost per memory channel is more than twice as high for x1 as for r3. DRAM is valuable precisely because of its speed, but that speed is not cost-effective with the x1. As such, the x1 is really for applications that can't scale with distributed memory. (Nothing new here, but this point is often overlooked.)

Similarly, you get a lot more SSDs with several r3 instances, so the aggregate disk bandwidth is also more cost-effective with r3.

sun_n_surf · 10 years ago
Not sure I quite understand your math here. The largest R3 instance is the r3.8xlarge with 244 GB of memory; 4 times that would only get you to about 1 TB. Also, this: "DRAM is valuable precisely because of its speed" is wrong (https://en.wikipedia.org/wiki/Dynamic_random-access_memory).
jedbrown · 10 years ago
1. 4 of those R3 instances cost less than the X1 but offer nearly double the bandwidth. The X1 is cheaper per GB, but much more expensive per GB/s.

2. If DRAM were not faster than NVRAM/SSD, nobody would use it. "Speed" involves both bandwidth and latency. Latency is probably similar or higher for the X1 instances, but I haven't seen numbers. We can make better estimates about realizable bandwidth based on the system stats.
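
If anyone spins one of these up, even a crude triad loop gives a ballpark for realizable bandwidth. A rough sketch (not the official STREAM benchmark; the array size is an arbitrary placeholder, compile with -O2):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (1UL << 27)   /* 128M doubles per array, 1 GiB each */

    int main(void) {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        if (!a || !b || !c) return 1;
        /* touch everything first so page faults don't land in the timed loop */
        for (size_t i = 0; i < N; i++) { a[i] = 0.0; b[i] = 1.0; c[i] = 2.0; }

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i];       /* triad: two reads, one write */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
        /* three arrays of N doubles cross the memory bus once; printing a[]
         * keeps the compiler from discarding the stores */
        printf("~%.1f GB/s (check: %g)\n",
               3.0 * N * sizeof(double) / secs / 1e9, a[N / 2]);
        return 0;
    }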

lovelearning · 10 years ago
This is probably a dumb question, but what does the hardware of such a massive machine look like? Is it just a single server box with a single motherboard? Are there server motherboards out there that support 2 TB of RAM, or is this some kind of distributed RAM?
zokier · 10 years ago
For example, Dell sells 4U servers straight out of their webshop which max out at 96x32GB (that's 3TB) of RAM with 4 CPUs (max 18 cores/CPU => 72 cores total). They seem to have some (training?) videos on YouTube that show the internals if you are curious:

https://www.youtube.com/watch?v=vS47RVrfBvE main system board

https://www.youtube.com/watch?v=_poMPOUGRa0 memory risers

schlarpc · 10 years ago
Don't know what hardware AWS is using, but Ark has server boards supporting 1.5TB, which is close enough to make 2TB believable: http://ark.intel.com/products/94187/Intel-Server-Board-S2600...

Edit: Supermicro has several 2TB boards, and even some 3TB ones: http://www.supermicro.com/products/motherboard/Xeon1333/#201...

(Disclaimer: AWS employee, no relation to EC2)

yuhong · 10 years ago
This would require expensive 64GB DDR4 LR-DIMMs though.
technologia · 10 years ago
We have some supermicros that have about 12TB RAM, but the built in fans sound like a jumbo jet taking off so consider the noise pollution for a second there.
jsmthrowaway · 10 years ago
Er, are you summing a TwinBlade chassis? You have to be.

6TB is about where single machines currently top out due to the hardware constraints of multiple vendors and architecture, and memory bandwidth starts being an issue. You have to throw 96x64GB at the ones that exist so wave buh bye to a cool half a million USD or so. If you're sitting on a 12TB box I want a SKU (I want one!).

I don't actually think Supermicro makes a 6TB SKU, even. That's Dell and HP land.

cbg0 · 10 years ago
> Are there server motherboards out there that support 2 TB of RAM

Sure, http://www.supermicro.com/products/motherboard/Xeon/C600/X10... supports 3TB in a 48 x 64GB DIMM configuration.

ereyes01 · 10 years ago
Once upon a time I hacked on the AIX kernel which ran on POWER hardware (I think they're up to POWER8 or higher now). In my time there the latest hardware was POWER7-based. It maxed out at 48 cores (with 4-way hyperthreading giving you 192 logical cores) and a max of I think 32TB RAM. Not the same hardware as mentioned in the OP, but pretty big scale nonetheless.

This shows a logical diagram of how they cobble all these cores together: http://www.redbooks.ibm.com/abstracts/tips0972.html?Open

I've seen these both opened up and racked up. They are basically split into max 4 rackmount systems, each I think was 2U IIRC. The 4 systems (max configuration) are connected together by a big fat cable, which is the interconnect between nodes in the Redbook I've linked above. The RAM was split 4 ways among the nodes, and NUMA really matters in these systems, since memory local to your nodes is much faster to access than memory across the interconnect.

This is what I observed about 5-6 years ago. I'm sure things have miniaturized further since then...
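
If you're curious how much that local-vs-remote penalty matters in practice, a rough libnuma sketch like this (Linux, link with -lnuma; the node ids, buffer size, and stride are placeholders) makes it visible on any multi-socket box:

    #include <numa.h>
    #include <stdio.h>
    #include <time.h>

    #define N (256UL << 20)   /* 256 MB per buffer */

    static double touch(volatile char *p) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t i = 0; i < N; i += 64)   /* one access per cache line */
            p[i]++;
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    }

    int main(void) {
        if (numa_available() < 0) { puts("no NUMA support"); return 1; }

        numa_run_on_node(0);                                   /* pin to node 0 */
        char *local  = numa_alloc_onnode(N, 0);                /* same node */
        char *remote = numa_alloc_onnode(N, numa_max_node());  /* far node */
        if (!local || !remote) return 1;

        printf("local:  %.3fs\n", touch(local));
        printf("remote: %.3fs\n", touch(remote));
        numa_free(local, N);
        numa_free(remote, N);
        return 0;
    }

On a single-node box the two timings simply come out the same.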

dekhn · 10 years ago
yeah, sure, you can get a quad xeon 2U server with 2TB of RAM for around $40K. Here's a sample configurator: https://www.swt.com/rq2u.php change the RAM and CPUs to your preference and add some flash.
rconti · 10 years ago
No insight into what Amazon uses, but we've got HP DL980s (G7s, so they're OLD) with 4TB of RAM, and just started using Oracle X5-8 x86 boxes with 6TB of RAM and 8 sockets. I believe 144 cores/288 threads.
eip · 10 years ago
http://www.thinkmate.com/system/rax-xt24-4460-10g

4 CPUs, 60 cores, 120 threads (cloud cores), 3TB RAM, 90TB SSD, 4 x 40Gb Ethernet, 4 RU. $120K.

Same price as the AWS instance for one year of on demand.

rodgerd · 10 years ago
I can stick 1.5 TB and two sockets in blades right now. Blades. Servers can carry a lot more, and it's not even especially expensive.
lovelearning · 10 years ago
Yeah, just realized my knowledge of server hardware is hopelessly outdated. They seem to be a couple of orders of magnitude more powerful than what I assumed was available.
zymhan · 10 years ago
4 physical CPUs and 1.9TB of RAM is doable in a 4U server for sure, and possibly in a 2U. So, it just looks like a big server.
lossolo · 10 years ago
Intel processors support up to 1536 GB of RAM, so basically 1.5 TB per processor.
wyldfire · 10 years ago
How flipping awesome is it that some very large portion (90% or so?) could probably all be one nice contiguous block of mine from x86_64 userspace with a quick mmap() and mlockall().
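
Roughly this, as a sketch (the ~1.5 TiB figure is a placeholder for an X1-sized box; you'd also need the RLIMIT_MEMLOCK / CAP_IPC_LOCK headroom and overcommit settings to let it succeed):

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void) {
        size_t len = 1536UL << 30;   /* ~1.5 TiB, adjust to taste */
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* fault in and pin everything currently (and subsequently) mapped */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) { perror("mlockall"); return 1; }

        printf("mapped and locked %zu bytes at %p\n", len, p);
        return 0;
    }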
rzzzt · 10 years ago
I think I have picked this up from an earlier thread discussing huge servers: http://yourdatafitsinram.com/

One of the links on the top points to a server with 96 DIMM slots, supporting up to 6 TB of memory in total.

mbesto · 10 years ago
IDK about AWS, but for SAP HANA, this is done via blades. I've seen 10 TB+.
KSS42 · 10 years ago
My guess is that it is not really DRAM but flash memory on a DIMM, like this product from Diablo Technologies:

http://www.diablo-technologies.com/memory1/

fra · 10 years ago
Your guess is wrong. It's DRAM plain and simple.
MasterScrat · 10 years ago
As a reference the archive of all Reddit comments from October 2007 to May 2015 is around 1 terabyte uncompressed.

You could do exhaustive analysis on that dataset fully in memory.

jedberg · 10 years ago
Your point is accurate, but I'd like to point out that the dataset isn't actually all the comments on Reddit -- it's only what they could scrape, which is limited to 1000 comments per account. So basically it's missing a lot of the historical comments of the oldest accounts.

I only point this out to try and correct a common error I see. You're absolutely right that it is awesome that the entire data set can be analyzed in RAM!

MasterScrat · 10 years ago
Are you sure? The dataset is from here: https://archive.org/details/2015_reddit_comments_corpus

Looking at the thread from the release, I see no explanation of how he got the data, but I see several people commenting that they finally have a way to get comments beyond the 1000 per account: https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_eve...

samstave · 10 years ago
It would be interesting to see the distribution of the 1000 comments from each account over a period of 12 months. Some people go dormant - like vacation, or depression, or lack of interest in topics - then cluster a bunch of comments when, say, they are on a drunken rage binge.

Or what time of day the accounts most frequently comment. (I'd bet there is an interesting grouping of those that post while at work during the day, and those who post from home at night.)

or what subreddits people comment in most during the day vs which /r/ they post to at night ;)

ers35 · 10 years ago
You may be interested in an SQLite version of the dataset that is 553 GB vs. the 908 GB JSON: https://archive.org/details/2015_reddit_comments_corpus_sqli...

The storage format of a dataset can make a big difference in memory usage.

flamedoge · 10 years ago
I would like to know how much of that is memes and shitposts
ChuckMcM · 10 years ago
That is pretty remarkable. One of the limitations of doing one's own version of mass analytics is the cost of acquiring, installing, configuring, and then maintaining the hardware. Generally I've found AWS to be more expensive but you get to "turn it on, turn it off" which is not something you can do when you have to pay monthly for data center space.

It makes for an interesting exercise to load in your data, do your analytics, and then store out the meta data. I wonder if the oil and gas people are looking at this for pre-processing their seismic data dumps.

ddorian43 · 10 years ago
Why does everyone (really!) compare AWS to colocation? I've never heard an aws-believer ever mention dedicated servers.

Why don't you compare AWS to building your own CPU?

ChuckMcM · 10 years ago
I suspect it is because "everyone" (which is to abscond with your definition) believes that colocation is an alternative to AWS (well, the EC2 part anyway). I would be interested to hear how you see them as not being comparable.

On your definition of "aws-believer": is that someone who feels that AWS is a superior solution for deploying a web-facing application in all cases? Does your definition include economics? (Like $/month vs. requests/month vs. latency?)

Can I assume that you consider comparing AWS to building your own CPU as an apples to oranges comparison? I certainly do, because I define a CPU to be a small component part of a distributed system hosting a web facing application.

samstave · 10 years ago
Just curious - but wouldn't the GPU-based instances be more efficient for the oil and gas people?

Or load a data set in this monster and then use GPU workers to hit it?

koolba · 10 years ago
GPUs work when the data is small and the calculation can be parallelized. Random access to memory from a GPU would be slow. It's more like a separate computer (or lots of separate computers) to which you can send a small program to execute and get the result back.
biot · 10 years ago
More food for thought: how many neurons + synapses can one model with that amount of RAM?
dastbe · 10 years ago
Are you using seismic to describe what the data is, how big the data is, or both?
ChuckMcM · 10 years ago
Here is a good link: http://www.seismicsurvey.com.au/

Basically you take waves that are transiting the area of interest and do transforms on them to ascertain the structure underground. Dave Hitz of NetApp used to joke that these guys have a great compression algorithm: they can convert a terabyte of data into 1 bit (oil/no-oil).

One of the challenges is that the algorithms are running in a volume of space, so 'nearest neighbor' in terms of samples has more than 8 vectors.

In the early 2000's they would stream their raw data off tape cartridges into a beowulf type cluster, process it, and then store the post processed (and smaller) data to storage arrays. Then that post processed data would go through its own round of processing. One of their challenges was that they ended up duplicating the data on multiple nodes because they needed it for their algorithm and it was too slow to fetch it across the network.

A single system image with a TB of memory would let them go back to some of their old mainframe algorithms which, I'm told, were much easier to maintain.

1024core · 10 years ago
Spot instances are about $13 - $19/hr, depending on zone. Not available in NorCal, Seoul, Sydney and a couple of other places.
snewman · 10 years ago
Do you mean on-demand instances? The announcement says "Spot bidding is on the near-term roadmap." And $13 / hour is the on-demand price in US East.
dharma1 · 10 years ago
Indeed, doesn't look like it's there yet. Based on that I guess the spot prices will be around $1-3/h - not bad, if you have a workload that can be interrupted.
dman · 10 years ago
Going to comment out the deallocation bits in all my code now.
PeCaN · 10 years ago
You jest, but sometimes that's exactly what you need for short-lived programs¹. Bump alloc and free on exit is super fast if your space complexity is bounded.

¹ http://www.drdobbs.com/cpp/increasing-compiler-speed-by-over...
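
A toy version of the idea, just to make it concrete (the slab size and 16-byte alignment are arbitrary):

    #include <stddef.h>
    #include <stdlib.h>

    static unsigned char *arena;
    static size_t arena_off, arena_cap;

    void arena_init(size_t cap) {
        arena = malloc(cap);
        arena_cap = arena ? cap : 0;
        arena_off = 0;
    }

    void *bump_alloc(size_t n) {
        size_t aligned = (n + 15) & ~(size_t)15;   /* keep 16-byte alignment */
        if (arena_off + aligned > arena_cap) return NULL;
        void *p = arena + arena_off;
        arena_off += aligned;
        return p;
    }

    int main(void) {
        arena_init((size_t)1 << 30);               /* one 1 GiB slab up front */
        int *xs = bump_alloc(1000 * sizeof *xs);
        if (!xs) return 1;
        for (int i = 0; i < 1000; i++) xs[i] = i;
        /* no free(): the whole arena goes away when the process exits */
        return 0;
    }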

sedachv · 10 years ago
JonL White actually wrote a serious paper about just this idea in 1980: http://dl.acm.org/citation.cfm?id=802797
tracker1 · 10 years ago
Memory leaks be damned... Seriously, that is just huge.
kylehotchkiss · 10 years ago
Add some bitcoin mining with the power you still have afterwards
pritambarhate · 10 years ago
Question for those who have used monster servers before:

Can PostgreSQL/MySQL use this type of hardware efficiently and scale up vertically? Also, can Memcached/Redis use all this RAM effectively?

I am genuinely interested in knowing this. Most of the time I work on small apps and don't have access to anything more than 16GB of RAM on a regular basis.

chucky_z · 10 years ago
Postgres scales great up to 256GB, at least with 9.4. After that it'll use the memory, but there's no real benefit. I don't know about MySQL. SQL Server scales linearly with memory even up to and past the 1TB point. I did encounter some NUMA node spanning speed issues, but numactl tuning fixed that.

I set up a handful of pgsql and Windows servers around this size. SQL Server at the time scaled better with memory. Pgsql never really got faster after a certain point, but with a lot of cores it handled tons of connections gracefully.

anarazel · 10 years ago
I've very successfully used shared_buffers of 2TB, without a lot of problems. You'd better enable huge pages, but that's a common optimization.
alfalfasprout · 10 years ago
I don't work on 2TB+ memory servers, but one of my servers is close to 1TB of RAM.

PostgreSQL scales nicely here. Main thing you're getting is a huge disk cache. Makes repeated queries nice and fast. Still I/O bound to some extent though.

Redis will scale nicely as well. But it won't be I/O bound.

Honestly, if you really need 1TB+ it's usually going to be for numerically intensive code. This kind of code is generally written to be highly vectorizable so the hardware prefetcher will usually mask memory access latency and you get massive speedups by having your entire dataset in memory. Algorithms that can memoize heavily also benefit greatly.
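
To make the prefetcher point concrete, here's a rough sketch contrasting a streaming pass with a dependent random walk over the same array (the size is an arbitrary placeholder); on most boxes the gap is one to two orders of magnitude:

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N (1UL << 26)   /* 64M entries of size_t, ~512 MB */

    static double secs(struct timespec a, struct timespec b) {
        return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) * 1e-9;
    }

    int main(void) {
        size_t *next = malloc(N * sizeof *next);
        if (!next) return 1;

        /* Sattolo's algorithm: a single-cycle random permutation to chase */
        for (size_t i = 0; i < N; i++) next[i] = i;
        for (size_t i = N - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t t = next[i]; next[i] = next[j]; next[j] = t;
        }

        struct timespec t0, t1;

        size_t sum = 0;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t i = 0; i < N; i++) sum += next[i];   /* streaming reads */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("sequential:  %.2fs\n", secs(t0, t1));

        size_t cur = 0;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t i = 0; i < N; i++) cur = next[cur];  /* dependent loads */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("random walk: %.2fs (ignore: %zu)\n", secs(t0, t1), sum + cur);

        free(next);
        return 0;
    }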

adwf · 10 years ago
I've used Postgres out to the terabyte+ range with no problems, so it all works fine. Of course, whenever you approach huge data sizes like this, it tends to change how you access the data a little. E.g. do more threads equal more user connections, or more parallel computation? Generally though, databases aren't really hindered by CPU so much as by the amount of memory in the machine, and this new instance is huge.

No idea about MySQL, people tend to scale that out rather than up.

jfindley · 10 years ago
For MySQL, it depends a bit what you're hoping to get out of scaling.

Scaling for performance reasons: Past a certain point, many workloads become difficult to scale due to limitations in the database process scheduler and various internals such as auto increment implementation and locking strategy. As you scale up, it's common to spend increasing percentages of your time sitting on a spinlock, with the result that diminishing returns start to kick in pretty hard.

Scaling for dataset size reasons: Still a bit complex, but generally more successful. For example, to avoid various nasty effects from having to handle IO operations on very large files, you need to start splitting your tables out into multiple files, and the sharding key for that can be hard to get right. But MySQL

In short, it's not impossible, but you need to be very careful with your schema and query design. In practice, this rarely happens because it's usually cheaper (in terms of engineering effort) to scale out rather than up.

vegancap · 10 years ago
Finally, an instance made for Java!
granos · 10 years ago
I dislike developing in Java. I am not a fanboy by any stretch of the imagination. That being said, someone who takes the time to understand how the JVM works and how to configure their processes with a proper operator's mindset can do amazing things in terms of resource usage.

It's easy to poke at Java for being a hog when in reality it's just poor coding and operating practices that lead to bloated runtime behavior.

placeybordeaux · 10 years ago
For a long time I wondered if it was a failing of the language or the culture.

After spending 4 days trying to diagnose a problem with HBase given the two errors "No region found" and "No table provided", and finally figuring out it was due to a version mismatch, I now believe it is the culture.

At the very least you should be printing a WARN when you connect to an incompatible version.

Kristine1975 · 10 years ago
So much this. Back in 2001 I used IntelliJ IDEA on a PC with 128MB of RAM. It worked perfectly, and it was the first IDE I used that checked my code while I was writing it. The much less evolved JBuilder on the other hand stopped every couple seconds for garbage collection.

Both were written in Java.

And don't get me started on Forte (developed by Sun itself, no less). It was even slower and more memory-hungry than JBuilder.

abraae · 10 years ago
I love Java. We shifted from C++ a year after it arrived on the scene. Since then, I've never needed to learn a new language in any depth. To me, that's a good thing and shows the longevity of the language.
yongjik · 10 years ago
> ...can do amazing things in terms of resource usage.

Sorry, but you just made my day. :P

sievebrain · 10 years ago
You jest, but think about how unbelievably painful it'd be to write a program that uses >1TB of RAM in C++ .... any bug that causes a segfault, div by zero, or really any kind of crash at all would mean you'd have to reload the entire dataset into RAM from scratch. That's gonna take a while no matter what.

You could work around it by using shared memory regions and the like but then you're doing a lot of extra work.

With a managed language and a bit of care around exception handling, you can write code that's pretty much invincible without much effort because you can't corrupt things arbitrarily.

Also, depending on the dataset in question you might find that things shrink. The latest HotSpots can deduplicate strings in memory as they garbage collect. If your dataset has a lot of repeated strings then you effectively get an interning scheme for free. I don't know if G1 can really work well with over 1TB of heap, though. I've only ever heard of it going up to a few hundred gigabytes.

Kristine1975 · 10 years ago
>With a managed language and a bit of care around exception handling, you can write code that's pretty much invincible without much effort because you can't corrupt things arbitrarily.

The JVM has crashed on me in the past (as in a hard crash, not a Java exception). Less often than the C++ programs I write do? Yes, but of course I wouldn't test a program on a 1TB dataset before ironing out all the kinks.

>The latest HotSpots can deduplicate strings in memory as they garbage collect

Obviously when working with huge datasets I would implement some kind of string deduplication myself. Most likely even a special string class and a memory allocation scheme optimized for write-once, read-many access and cache friendliness.

Or I would use memory mapping for the input file and let the OS's virtual memory management sort it out.

0xfaded · 10 years ago
mmap is not "a lot of extra work".
tosseraccount · 10 years ago
Use shared memory.
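
Roughly like this, as a sketch with POSIX shm (the region name and size are made up): a crashed worker just re-mmaps the object instead of reloading from disk.

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        size_t len = (size_t)1 << 30;   /* 1 GiB for the demo */

        int fd = shm_open("/dataset", O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("shm_open"); return 1; }
        if (ftruncate(fd, (off_t)len) != 0) { perror("ftruncate"); return 1; }

        char *data = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (data == MAP_FAILED) { perror("mmap"); return 1; }

        /* parse/load once; the pages survive this process dying, and the
         * object only disappears on shm_unlink("/dataset") or reboot */
        data[0] = 42;
        printf("first byte: %d\n", data[0]);
        return 0;
    }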
scaleout1 · 10 years ago
When you suddenly realize that your "big" data is not really that big! Who needs a Hadoop/Spark cluster when you can run one of these bad boys?
tracker1 · 10 years ago
That was kind of my thought as well... I worked on a small-to-mid-sized classifieds site (about 10-12 unique visitors a month on average) and even then the core dataset was about 8-10GB, with some log-like data hitting around 4-5GB/month. This is freakishly huge. I don't know enough about different platforms to even digest how well you can utilize that much memory. Though it would be a first to genuinely have way more hardware than you'll likely ever need for something.

IIRC, the images for the site were closer to 7-8TB, but I don't know how typical that is for other types of sites, and caching every image on the site in memory is pretty impractical... just the same... damn.

samstave · 10 years ago
Heh, but I wonder what the default per account limits are on launching these... prolly (1) per account.
saosebastiao · 10 years ago
All I can think about is the 30 minute garbage collection pauses.
stcredzero · 10 years ago
Actually, as far as VMs go, the JVM is fairly spare in comparison with earlier versions of Ruby and Python -- on a per-object basis. (Because of its Smalltalk roots. Yes, I had to get that in there. Drink!) That said, I've seen those horrors of cargo-cult imitation of the Gang of Four patterns, resulting in my having to instantiate 7 freaking objects to send one JMS message.

If practice in recent decades has taught us anything, it's that performance is found in intelligently using the cache. In a multi-core concurrent world, our tools should be biased towards pass by value, allocation on the stack/avoiding allocating on the heap, and avoiding chasing pointers and branching just to facilitate code organization.

EDIT: Or, as placeybordeaux puts it more succinctly in a nephew comment, "VM or culture? It's the culture."

EDIT: It just occurred to me -- Programming suffers from a worship of Context-Free "Clever"!

Whether or not a particular pattern or decision is smart is highly dependent on context. (In the general sense, not the function-call one.) The difficulty with programming is that often the context is very involved and hard to convey in media. As a result, a whole lot of arguments are made for or against patterns/paradigms/languages using largely context-free examples.

This is why we end up in so many meaningless arguments akin to, "What is the ultimate bladed weapon?" That's simply a meaningless question, because the effectiveness of such items is very highly dependent on context. (Look up Matt Easton on YouTube.)

The analogy works in terms of the degree of fanboi nonsense.

aaronkrolik · 10 years ago
A small word of caution: I'd strongly recommend against using a huge Java heap size. Java GC is stop-the-world, and a huge heap can lead to hour-long GC sessions. It's much better to store data in a memory-mapped file that is off-heap, and access it accordingly. Still very fast.
Xorlev · 10 years ago
Good advice. Even with G1GC it's hard to run heaps that large. However, not to be overly pedantic, Java GC has many different algorithms and many avoid STW collection for as long as possible and do concurrent collection until it's no longer possible. I don't think it's fair to just call it stop the world.
tracker1 · 10 years ago
I know that you are probably going to be modded into oblivion, but can Java address this much memory in a single application? I'm genuinely curious, as I would assume, depending on the OS, that you'd have to run several (many) processes in order to even address that much RAM effectively.

Still really cool to see something like this, I didn't even know you could get close to 2TB of ram in a single server at any kind of scale.

fulafel · 10 years ago
Bigger iron has been at 64-512 TB for a while:

http://www.cray.com/blog/the-power-of-512-terabytes-of-share...

http://www.enterprisetech.com/2014/10/06/ibm-takes-big-workl...

Or significantly higher if you don't restrict yourself to single-system-image, shared-memory machines - there are at least two systems with 1300-1500 TB of memory on the Top500 list.

wmfiv · 10 years ago
Not using the out-of-the-box solutions. But while I haven't done this personally, my understanding is Azul Zing will allow you to efficiently use multi-TB heaps in Java.
astral303 · 10 years ago
Java can address heaps up to about 32GB with the -XX:+UseCompressedOops flag enabled. With that flag off, you can address as much as 64 bits will allow. http://stackoverflow.com/questions/2093679/max-memory-for-64...

Do a little research before implying that there's no way that Java can address gigantic heaps.

0xmohit · 10 years ago
and Scala too.

Scala _beats_ Java in most of the benchmarks: http://benchmarksgame.alioth.debian.org/u64q/scala.html

igouy · 10 years ago
> _beats_ Java

Not according to that data!