I also found a post of an admin a few days ago where he asked if there was a Windows setting for disallowing any access to %Appdata%. The response was that if access to %Appdata% is completely blocked Windows won't work anymore.
I also found a post of an admin a few days ago where he asked if there was a Windows setting for disallowing any access to %Appdata%. The response was that if access to %Appdata% is completely blocked Windows won't work anymore.
(That may even happen already on some architectures for all I know)
Yes, most CPUs have special instructions for swapping between little and big endian byte arrangement. The GCC compiler has the __builtin_bswap64(x) for accessing this instruction. However this is an additional instruction that needs to be executed for each read of a 64-bit word that needs to be converted, in some workloads this can double the number of executed instructions and hence add significant overhead.
Supporting big endian CPUs in systems programming sucks beyond imagination. There are virtually no big endian users anymore and making sure your software works fine on big endian requires testing it on a big endian CPU. However it is not possible to buy a big endian CPU anymore as there exist no more consumer big endian CPUs. For this reason I still have a Mac PowerPC from 2003 at home running an ancient version of Mac OS X. But over the last 2 years I have stopped testing my software on big endian, I just don't care about big endian anymore...
Indeed, reading file headers byte by byte also avoids alignment issues on some CPUs. At least older ARM CPUs trapped misaligned reads (not sure if this is still the case though).
No this is not the case anymore. Nowadays support for unaligned memory accesses is very good on ARM and most other CPU architectures. On x86 aligned memory used to be very important for SIMD but now there are even special SIMD instructions for unaligned data and the performance overhead of unaligned memory accesses is generally very small in my experience.
Although to borrow from minusf’s point, it’s good for software robustness that file formats and hardware use different endianness, as it forces you to read things byte-by-byte rather than lazily assuming you can just read 4 bytes and cast them directly to an int32.
Except that it is very bad for performance. As far as CPUs are concerned little-endian has definitely won, most CPU architectures that have been big endian in the past (e.g. PowerPC) are now little endian by default.
If all new CPU architectures are little endian this means that within a decade or two there won't be any operating systems that support big endian anymore.
I wish there was a C/C++ compiler flag for treating all pointers as __restrict__. However I guess that C/C++ standard libraries wouldn't work with this compiler option (and therefore this compiler option wouldn't be useful in practice).
So while it is true that by default Rust has a theoretical performance advantage (compared to C/C++) because it forbids aliasing pointers I wonder (doubt) whether this will cause Rust binaries to generally run faster than C/C++ binaries.
On the other hand Rust puts security first, so there are lots of array bounds checks in Rust programs (and not all of these bounds checks can be eliminated). Personally I think this feature deteriorates the performance of Rust programs much more than you gain by forbidding aliasing pointers.
void foo(/*restrict*/ bool* x, int* y) {
if (*x) {
printf("foo\n");
*y = 0;
}
if (*x) {
printf("bar\n");
}
}
Enabling strict aliasing is effectively an assertion that pointers of incompatible types will never point to the same data, so a write to y will never touch *x. restrict is an assertion to the compiler on that specific pointer that no other pointer aliases to it.I wish there was a C/C++ compiler flag for treating all pointers as __restrict__. However I guess that C/C++ standard libraries wouldn't work with this compiler option (and therefore this compiler option wouldn't be useful in practice).
What I don't fully understand is: "GCC has the option -fstrict-aliasing which enables aliasing optimizations globally and expects you to ensure that nothing gets illegally aliased. This optimization is enabled for -O2 and -O3 I believe." (source: https://stackoverflow.com/a/7298596)
Doesn't this mean that C++ programs compiled in release mode behave as if all pointers are marked with __restrict?
I understand TPUs and GPUs easily beat these guys but it would still be interesting to see what raw cpu power can achieve in 2020.
I have not done any machine learning on AWS Graviton2 CPUs but I ran many other CPU benchmarks on Graviton2 CPUs and overall I have been disappointed by their performance. They are still much slower than current x64 CPUs (x64 CPUs are up to 2x faster in single thread mode).
According to the benchmarks from Anandtech the Ampere Altra should have much better performance than Graviton2 CPUs as its performance is neck to neck with the fastest x64 CPUs.
Windows currently has a significant scaling issue because of its Processor Groups design, it is actually more of an ugly hack that was added to Windows 7 to support more than 64 threads. Everyone makes bad decisions when developing a kernel, the difference between the Windows NT kernel and the Linux kernel is that fundamental design flaws tend to get eventually fixed in the Linux kernel while they rarely get fixed in the Windows NT kernel.