- From: Ed Hall <ed_ka2fwj@yahoo.com>
- To: "cado-nfs@inria.fr" <cado-nfs@inria.fr>
- Subject: [cado-nfs] Unexpected Memory Use Change
- Date: Tue, 6 Apr 2021 11:23:00 +0000 (UTC)
Dear Team Members,
I'm having a memory issue with the latest commit. I have nearly my whole farm running an earlier commit that uses memory based on client processes, i.e. one process uses ~4GB for the current job regardless of the number of threads. With the latest commit, the memory use appears to be related to the number of threads.
I haven't tried intermediate commits, but here are some samples of the earlier one I was using and then the latest I tried.
Expected behavior from before:
=============================================================
commit 8ab2eea7202ffaf62825057c50f57359be244c29 (HEAD)
Merge: cfc03e7e4 7c10b68f0
Author: Emmanuel Thomé <emmanuel.thome@inria.fr>
Date: Wed Oct 21 10:29:52 2020 +0200
-------client using one process with 2 threads-------
./cado-nfs-client.py --clientid=test --single --bindir=build/$HOSTNAME/ --server=http://math99.local:13573
INFO:root:Running build/math89/sieve/las -poly download/hcn10+3_930.poly -q0 533124000 -I 15 -q1 533126000 -lim0 536000000 -lim1 536000000 -lpb0 32 -lpb1 32 -mfb0 64 -mfb1 94 -ncurves0 20 -ncurves1 20 -fb1 download/hcn10+3_930.roots1.gz -out test.work/hcn10+3_930.533124000-533126000.gz -t 2 -sqside 1 -adjust-strategy 2 -stats-stderr
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15157 math89 20 0 4148676 3.7g 7292 S 200.3 24.0 16:09.45 las
----------------------------------------------------------------
-------client using one process with 6 threads-------
./cado-nfs-client.py --clientid=test --single --bindir=build/$HOSTNAME/ --override t 6 --server=http://math99.local:13573
INFO:root:Running build/math89/sieve/las -poly download/hcn10+3_930.poly -q0 533366000 -I 15 -q1 533368000 -lim0 536000000 -lim1 536000000 -lpb0 32 -lpb1 32 -mfb0 64 -mfb1 94 -ncurves0 20 -ncurves1 20 -fb1 download/hcn10+3_930.roots1.gz -out test.work/hcn10+3_930.533366000-533368000.gz -t 6 -sqside 1 -adjust-strategy 2 -stats-stderr
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15191 math89 20 0 4123972 3.6g 7200 S 551.0 23.3 16:04.17 las
=============================================================
Behavior now:
=============================================================
commit 1d08d4325615139dbe007b43b31e734c4b7dec8d (HEAD -> master, origin/master, origin/HEAD)
Merge: 60f1fe069 0b8ff834d
Author: Emmanuel Thomé <emmanuel.thome@inria.fr>
Date: Sun Apr 4 19:55:46 2021 +0000
Merge branch 'lean-ci' into 'master'
simplify ci structure, get rid of intermediary containers.
See merge request cado-nfs/cado-nfs!30
-------client using one process with 2 threads-------
./cado-nfs-client.py --clientid=test --single --bindir=build/$HOSTNAME/ --server=http://math99.local:13573
INFO:root:Running build/math89/sieve/las -poly download/hcn10+3_930.poly -q0 534054000 -I 15 -q1 534056000 -lim0 536000000 -lim1 536000000 -lpb0 32 -lpb1 32 -mfb0 64 -mfb1 94 -ncurves0 20 -ncurves1 20 -fb1 download/hcn10+3_930.roots1.gz -out test.work/hcn10+3_930.534054000-534056000.gz -t 2 -sqside 1 -adjust-strategy 2 -stats-stderr
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
38473 math89 20 0 7884416 7.3g 7432 S 200.3 47.1 16:06.76 las
----------------------------------------------------------------
-------client using one process with 6 threads-------
./cado-nfs-client.py --clientid=test --single --bindir=build/$HOSTNAME/ --override t 6 --server=http://math99.local:13573
INFO:root:Running build/math89/sieve/las -poly download/hcn10+3_930.poly -q0 534556000 -I 15 -q1 534558000 -lim0 536000000 -lim1 536000000 -lpb0 32 -lpb1 32 -mfb0 64 -mfb1 94 -ncurves0 20 -ncurves1 20 -fb1 download/hcn10+3_930.roots1.gz -out test.work/hcn10+3_930.534556000-534558000.gz -t 6 -sqside 1 -adjust-strategy 2 -stats-stderr
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
38668 math89 20 0 12.8g 12.5g 7308 S 587.4 80.1 16:05.73 las
==========================================================================
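As a rough check on the figures above, the resident memory (RES) values from top can be turned into an approximate per-thread cost. This is only a sketch: it assumes memory grows linearly with the thread count, which two samples per commit cannot confirm, and the helper names (parse_res, per_thread_slope) are made up for this illustration. The 12-thread figure is an extrapolation, not a measurement.

```python
def parse_res(value: str) -> float:
    """Convert a top RES field to GiB (bare numbers are KiB; m/g/t are MiB/GiB/TiB)."""
    units = {"m": 1 / 1024, "g": 1.0, "t": 1024.0}
    suffix = value[-1].lower()
    if suffix in units:
        return float(value[:-1]) * units[suffix]
    return float(value) / (1024 * 1024)


def per_thread_slope(samples):
    """GiB of RES added per extra thread, from exactly two (threads, RES) samples."""
    (t0, r0), (t1, r1) = samples
    return (parse_res(r1) - parse_res(r0)) / (t1 - t0)


# (threads, RES) pairs copied from the top output above
old_commit = [(2, "3.7g"), (6, "3.6g")]   # 8ab2eea72
new_commit = [(2, "7.3g"), (6, "12.5g")]  # 1d08d4325

old_slope = per_thread_slope(old_commit)
new_slope = per_thread_slope(new_commit)

# Assumed linear model for the new commit: RES = base + slope * threads
base = parse_res("7.3g") - 2 * new_slope
projected_12 = base + 12 * new_slope

# top reports 12.5g as 80.1 %MEM, which implies the machine's total RAM
total_ram = parse_res("12.5g") / 0.801

print(f"old commit: {old_slope:+.2f} GiB per extra thread")
print(f"new commit: {new_slope:+.2f} GiB per extra thread")
print(f"projected RES at 12 threads: {projected_12:.1f} GiB")
print(f"approximate total RAM: {total_ram:.1f} GiB")
```

Under that assumption, the new commit costs about 1.3 GiB per additional thread, while the old commit is essentially flat, and a 12-thread run would need more memory than the machine appears to have.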
If I use all 12 cores (without the additional 12 threads), the machine comes to a standstill, and I have to issue a "pkill las" over ssh and wait a few minutes for it to take effect.
Thanks for all your work.
Sincerely,
Edwin Hall