walki (u/walki) - Readit News

walki commented on Windows NT vs. Unix: A design comparison blogsystem5.substack.com/... · Posted by u/LorenDB

walki · 2 years ago

I feel like the NT kernel is in maintenance-only mode and will eventually be replaced by the Linux kernel. I submitted a Windows kernel bug to Microsoft a few years ago, and even though they acknowledged the bug, the issue was closed as "won't fix" because fixing it would require backwards-incompatible changes.

Windows currently has a significant scaling issue because of its Processor Groups design, which is really more of an ugly hack that was added in Windows 7 to support more than 64 hardware threads. Everyone makes bad decisions when developing a kernel. The difference between the Windows NT kernel and the Linux kernel is that fundamental design flaws tend to get fixed eventually in the Linux kernel, while they rarely get fixed in the Windows NT kernel.

walki commented on Bitwarden Heist – How to break into password vaults without using passwords blog.redteam-pentesting.d... · Posted by u/RedTeamPT

walki · 2 years ago

Microsoft's %Appdata% directory is a security nightmare in my opinion. Ideally, applications should only have access to their own directories in %Appdata% by default. I recently came across a Python script on GitHub that can decrypt the passwords that browsers store locally in their %Appdata% directories. Many attacks could be prevented if access to %Appdata% was more restricted.

I also found a post by an admin a few days ago asking whether there was a Windows setting for disallowing any access to %Appdata%. The response was that if access to %Appdata% is completely blocked, Windows won't work anymore.

walki commented on Endianness, a constant source of conflict for decades technicalsourcery.net/pos... · Posted by u/sgt

iainmerrick · 5 years ago

For performance, can’t you “just” have a swap-endianness instruction in your CPU, and have the compiler use it when it detects byte-shuffling code?

(That may even happen already on some architectures for all I know)

walki · 5 years ago

> For performance, can’t you “just” have a swap-endianness instruction in your CPU

Yes, most CPUs have special instructions for swapping between little and big endian byte order. GCC provides the __builtin_bswap64(x) builtin for accessing this instruction. However, this is an additional instruction that needs to be executed for each read of a 64-bit word that needs to be converted. In some workloads this can double the number of executed instructions and hence add significant overhead.
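As a rough sketch of the byte swap described above: the code below uses __builtin_bswap64 when compiled with GCC or Clang and falls back to shifts and masks otherwise. The helper names (bswap64, bswap64_portable, swap_words_in_place) are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: reverse the 8 bytes of x with shifts and masks. */
static uint64_t bswap64_portable(uint64_t x)
{
    x = (x << 32) | (x >> 32);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    x = ((x & 0x00FF00FF00FF00FFULL) << 8) | ((x >> 8) & 0x00FF00FF00FF00FFULL);
    return x;
}

/* Use the compiler builtin where available (assumption: GCC or Clang). */
static uint64_t bswap64(uint64_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap64(x);
#else
    return bswap64_portable(x);
#endif
}

/* Convert a buffer of 64-bit words to the opposite byte order.
   This is the extra per-word work that a mismatched-endian host pays. */
static void swap_words_in_place(uint64_t *words, size_t n)
{
    assert(words != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        words[i] = bswap64(words[i]);
}
```

Calling swap_words_in_place on a buffer that was read from a big endian file makes its words usable on a little endian host, at the cost of one swap per word.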

Supporting big endian CPUs in systems programming sucks beyond imagination. There are virtually no big endian users anymore, and making sure your software works fine on big endian requires testing it on a big endian CPU. However, it is no longer possible to buy one, as no consumer big endian CPUs exist anymore. For this reason I still have a Mac PowerPC from 2003 at home running an ancient version of Mac OS X. But over the last 2 years I have stopped testing my software on big endian, I just don't care about big endian anymore...

walki commented on Endianness, a constant source of conflict for decades technicalsourcery.net/pos... · Posted by u/sgt

flohofwoe · 5 years ago

> as it forces you to read things byte-by-byte

Indeed, reading file headers byte by byte also avoids alignment issues on some CPUs. At least older ARM CPUs trapped misaligned reads (not sure if this is still the case though).

walki · 5 years ago

> not sure if this is still the case though

No, this is not the case anymore. Nowadays support for unaligned memory accesses is very good on ARM and most other CPU architectures. On x86, aligned memory used to be very important for SIMD, but now there are even special SIMD instructions for unaligned data, and the performance overhead of unaligned memory accesses is generally very small in my experience.
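Even when the hardware handles unaligned accesses well, casting a misaligned pointer in C is still undefined behavior, so a common portable pattern is to go through memcpy. This is a minimal sketch with hypothetical helper names; compilers typically reduce a fixed-size memcpy like this to a plain load or store on targets that allow unaligned access, though that is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value (host byte order) from an address that may not be
   4-byte aligned. memcpy avoids dereferencing a misaligned uint32_t*. */
static uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value (host byte order) to a possibly unaligned address. */
static void store_u32_unaligned(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```

For example, store_u32_unaligned(buf + 1, value) followed by load_u32_unaligned(buf + 1) round-trips the value regardless of how buf is aligned.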

walki commented on Endianness, a constant source of conflict for decades technicalsourcery.net/pos... · Posted by u/sgt

iainmerrick · 5 years ago

Apart from most file formats and internet standards being big-endian, you mean?

Although to borrow from minusf’s point, it’s good for software robustness that file formats and hardware use different endianness, as it forces you to read things byte-by-byte rather than lazily assuming you can just read 4 bytes and cast them directly to an int32.

walki · 5 years ago

> it’s good for software robustness that file formats and hardware use different endianness, as it forces you to read things byte-by-byte rather than lazily assuming you can just read 4 bytes and cast them directly to an int32.

Except that it is very bad for performance. As far as CPUs are concerned, little endian has definitely won: most CPU architectures that were big endian in the past (e.g. PowerPC) are now little endian by default.

If all new CPU architectures are little endian, this means that within a decade or two there won't be any operating systems that support big endian anymore.
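For reference, the byte-by-byte reading mentioned in the quote can be sketched like this. The function names are hypothetical; the point is that the result does not depend on the host's endianness or on the alignment of the buffer, at the cost of more work than a single 4-byte load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a big endian 32-bit value one byte at a time.
   Works the same on little and big endian hosts. */
static uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode a little endian 32-bit value one byte at a time. */
static uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Given the header bytes 12 34 56 78 (hex), read_be32 yields 0x12345678 and read_le32 yields 0x78563412.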

walki commented on Rust is now overall faster than C in benchmarks benchmarksgame-team.pages... · Posted by u/wiineeth

walki · 5 years ago

OK thanks, indeed Clang is able to generate better assembly using __restrict__. And -O3 generates the same assembly as -O3 -fstrict-aliasing (which is not as good as __restrict__).

I wish there was a C/C++ compiler flag for treating all pointers as __restrict__. However I guess that C/C++ standard libraries wouldn't work with this compiler option (and therefore this compiler option wouldn't be useful in practice).

walki · 5 years ago

What's interesting to note though is that I tried marking pointers __restrict__ in the performance critical sections in 2 of my C++ projects and the assembly generated by Clang was identical in all cases!

So while it is true that by default Rust has a theoretical performance advantage (compared to C/C++) because it forbids aliasing pointers I wonder (doubt) whether this will cause Rust binaries to generally run faster than C/C++ binaries.

On the other hand Rust puts security first, so there are lots of array bounds checks in Rust programs (and not all of these bounds checks can be eliminated). Personally I think this feature deteriorates the performance of Rust programs much more than you gain by forbidding aliasing pointers.

walki commented on Rust is now overall faster than C in benchmarks benchmarksgame-team.pages... · Posted by u/wiineeth

murderfs · 5 years ago

restrict and strict aliasing have to do with the same general concept, but aren't the same. They both have to do with allowing the compiler to optimize around assuming that writes to one pointer won't be visible while reading from another. As a concrete example, can the following branches be merged?

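  #include <stdbool.h>
  #include <stdio.h>

  /* The two branches can only be merged if the write to *y cannot change *x. */
  void foo(/*restrict*/ bool* x, int* y) {
    if (*x) {
      printf("foo\n");
      *y = 0;
    }
    if (*x) {
      printf("bar\n");
    }
  }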

Enabling strict aliasing is effectively an assertion that pointers of incompatible types will never point to the same data, so a write to y will never touch *x. restrict is an assertion to the compiler on that specific pointer that no other pointer aliases to it.

walki · 5 years ago

OK thanks, indeed Clang is able to generate better assembly using __restrict__. And -O3 generates the same assembly as -O3 -fstrict-aliasing (which is not as good as __restrict__).

I wish there was a C/C++ compiler flag for treating all pointers as __restrict__. However I guess that C/C++ standard libraries wouldn't work with this compiler option (and therefore this compiler option wouldn't be useful in practice).
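A small sketch of what marking pointers as restrict looks like in practice. It uses the standard C99 restrict keyword (__restrict__ is the GCC/Clang spelling that also works in C++), and the function names are made up for illustration. Whether the generated assembly actually differs depends on the compiler and the surrounding code.

```c
#include <assert.h>
#include <stddef.h>

/* Without restrict: sum may point into values, so the compiler has to
   assume every store to *sum can change the elements it reads next. */
static void accumulate(int *sum, const int *values, size_t n)
{
    assert(sum != NULL && (values != NULL || n == 0));
    for (size_t i = 0; i < n; i++)
        *sum += values[i];
}

/* With restrict: the caller promises that sum and values do not overlap,
   so the compiler may keep the running total in a register and store it
   once. Calling this with overlapping pointers is undefined behavior. */
static void accumulate_restrict(int *restrict sum,
                                const int *restrict values, size_t n)
{
    assert(sum != NULL && (values != NULL || n == 0));
    for (size_t i = 0; i < n; i++)
        *sum += values[i];
}
```

Both functions return the same result for non-overlapping arguments; the restrict version only gives the optimizer more freedom.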

walki commented on Rust is now overall faster than C in benchmarks benchmarksgame-team.pages... · Posted by u/wiineeth

arcticbull · 5 years ago

Rust can be faster than C because in general C compilers have to assume that pointers to memory locations can overlap (unless you mark them __restrict). Rust forbids aliasing pointers. This opens up a whole world of optimizations in the Rust compiler. Broadly speaking this is why Rust can genuinely be faster than C. Same is true in FORTRAN, for what it's worth.

walki · 5 years ago

> C compilers have to assume that pointers to memory locations can overlap, unless you mark them __restrict...

What I don't fully understand is: "GCC has the option -fstrict-aliasing which enables aliasing optimizations globally and expects you to ensure that nothing gets illegally aliased. This optimization is enabled for -O2 and -O3 I believe." (source: https://stackoverflow.com/a/7298596)

Doesn't this mean that C++ programs compiled in release mode behave as if all pointers are marked with __restrict?
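A minimal sketch of why the two are not equivalent: strict aliasing only lets the compiler assume that pointers to incompatible types do not overlap, while two pointers of the same type may still legally point to the same object. The function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Same type: a and b may alias even with -fstrict-aliasing, so the
   compiler must reload *a after the store to *b. */
static int store_same_type(int *a, int *b)
{
    assert(a != NULL && b != NULL);
    *a = 1;
    *b = 2;
    return *a; /* 2 if a == b, otherwise 1 */
}

/* Incompatible types: under strict aliasing the compiler may assume the
   store to *b cannot modify *a and return the constant 1 directly. */
static int store_different_types(int *a, float *b)
{
    assert(a != NULL && b != NULL);
    *a = 1;
    *b = 2.0f;
    return *a;
}
```

Only restrict removes the reload in the same-type case, and only because the caller then promises that the pointers never overlap.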

walki commented on The Ampere Altra Review: 2x 80 Cores Arm Server Performance Monster anandtech.com/show/16315/... · Posted by u/sien

panpanna · 5 years ago

Has anyone tried machine learning on this or the Graviton2?

I understand TPUs and GPUs easily beat these guys but it would still be interesting to see what raw cpu power can achieve in 2020.

walki · 5 years ago

> Has anyone tried machine learning on this or the Graviton2?

I have not done any machine learning on AWS Graviton2 CPUs, but I ran many other CPU benchmarks on Graviton2 CPUs and overall I have been disappointed by their performance. They are still much slower than current x64 CPUs (x64 CPUs are up to 2x faster in single-threaded mode).

According to the benchmarks from Anandtech, the Ampere Altra should have much better performance than Graviton2 CPUs, as its performance is neck and neck with the fastest x64 CPUs.

walki commented on Dear Google Cloud: Your Deprecation Policy Is Killing You medium.com/@steve.yegge/d... · Posted by u/bigiain

merb · 6 years ago

that was silly because they only asked for a company name but you could've entered "Individual" inside the box and they would not care!

walki · 6 years ago

They also asked for the VAT number of the company...