Is most of the time taking square roots? From what I can tell, computing D is one of those embarrassingly parallel operations that scales linearly. Thus, 144 cores should take
15*60/144 = 6.25 seconds
to recompute D. This would also provide some practical parallel experience with the 3950.
Thanks, you defined my next work step. Even my 16C/32T AMD 7950X PC should make that much faster, since its cores run at 4.6GHz compared to the 2.3GHz of the (currently 72) cores in the 4-socket system.
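As a quick sanity check of the 15*60/144 estimate, here is a minimal C sketch of the ideal linear-scaling arithmetic. The function name ideal_parallel_seconds is made up for illustration. It assumes the 15 minutes is the total single-core work, that the work splits perfectly, and that there is no threading or memory overhead, so a real run will likely be slower.

```c
#include <assert.h>

/* Ideal (linear) scaling: wall-clock seconds on `cores` cores, assuming the
 * work divides perfectly and there is no overhead (an assumption, not a
 * measurement). */
double ideal_parallel_seconds(double single_core_seconds, int cores)
{
    assert(cores > 0);  /* a core count of zero or less makes no sense */
    return single_core_seconds / cores;
}
```

Calling ideal_parallel_seconds(15.0 * 60.0, 144) reproduces the 6.25 seconds from the estimate.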
What is the maximum power you've seen the 8-socket system use under load?
During the 9h(!) long computation for determining the 171.87 GFLOPS of the 8-socket system ...
https://gist.github.com/Hermann-SW/c9bf ... c15d429743
... the system drew 2.1kW(!) the whole time.
Is it possible to develop the code with a Pi?
Please create a new Raspberry Pi thread "Pi cluster with numa nodes" on this, outside of "Off topic discussion". Here is what I found out so far as a starter:
- 2018/2019 3-part series on building a Pi cluster ...
- ... using Slurm
https://en.wikipedia.org/wiki/Slurm_Workload_Manager
- I searched for "numa slurm" and found this:
https://slurm.schedmd.com/mc_support.html#defs
- Since I want to develop with the pthread library, the question is whether numactl and other NUMA library functionality can be built on a Pi cluster (Slurm is for processes)
- Fast interconnect (QPI) information on the links connecting the sockets of my server:
https://pubs.lenovo.com/x3850-x6-6241/P ... df#page=20
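To make the pthread part of the question concrete, here is a minimal sketch of the embarrassingly parallel pattern in plain POSIX threads, which builds the same way on a Pi and on the server. Everything in it is an assumption for illustration: the names parallel_square and MAX_THREADS are made up, and squaring each element is only a stand-in for the real per-element work. It does no NUMA placement at all, so it says nothing about whether numactl or libnuma functionality is usable on a Pi cluster.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64  /* arbitrary upper bound for this sketch */

typedef struct {
    const double *in;   /* shared input array */
    double *out;        /* shared output array */
    size_t begin, end;  /* half-open index range owned by this thread */
} chunk_t;

/* Worker: processes only its own index range, so no locking is needed. */
static void *square_chunk(void *arg)
{
    chunk_t *c = arg;
    for (size_t i = c->begin; i < c->end; i++)
        c->out[i] = c->in[i] * c->in[i];  /* stand-in for the real work */
    return NULL;
}

/* Splits n elements into nthreads contiguous chunks and processes them in
 * parallel. Returns 0 on success, -1 on a bad thread count or if a thread
 * could not be created. */
int parallel_square(const double *in, double *out, size_t n, int nthreads)
{
    assert(in != NULL && out != NULL);
    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    pthread_t tid[MAX_THREADS];
    chunk_t chunk[MAX_THREADS];
    int started = 0, rc = 0;

    for (int t = 0; t < nthreads; t++) {
        chunk[t].in = in;
        chunk[t].out = out;
        chunk[t].begin = n * (size_t)t / (size_t)nthreads;
        chunk[t].end = n * (size_t)(t + 1) / (size_t)nthreads;
        if (pthread_create(&tid[t], NULL, square_chunk, &chunk[t]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    /* Join every thread that was actually started, even after a failure. */
    for (int t = 0; t < started; t++)
        pthread_join(tid[t], NULL);
    return rc;
}
```

On older toolchains this needs -pthread when compiling and linking. Contiguous chunks per thread are used here because they are the simplest split.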
P.S:
I tried to buy cheap used ECC RAM modules to reach 4 RAM modules for each of the 8 sockets. While buying sixteen 2Rx4 PC4-2133P 16GB ECC RAM modules from multiple private sellers and businesses, I ended up with a mixture of the Samsung model I already have and cheaper Micron models. Because the twelve 16GB 1Rx4 RAM modules I use in my dual-socket E5-2680v4 server are not supported according to the product documentation (of both multi-socket servers), I decided to use the 2Rx4 Micron modules for that server. Then I added orders, for a total of 16 new Micron 16GB modules and 16 new Samsung 16GB modules. Because I won some auctions really cheaply (four 16GB modules for 25.50€ in total and another four for 30€ in total), the average price including shipping was only 12.65€ per 16GB module for all 32 new modules. There is no risk, thanks to eBay buyer protection and testing the modules when they arrive: either they work as described, or I get my money back. Since the Samsung and Micron modules are both 2Rx4 PC4-2133P, mixing should be possible. So besides the 512GB Samsung 8-socket server and the 256GB Micron 2-socket server, both with 4 RAM modules per channel, 12 modules per socket in the current 4-socket system now become possible. That would be 192GB for each of the 4 sockets ...
Statistics: Posted by HermannSW — Tue Aug 19, 2025 3:58 pm