cado-nfs - [cado-nfs] Unexpected Memory Use Change

Subject: Discussion related to cado-nfs

  • From: Ed Hall <ed_ka2fwj@yahoo.com>
  • To: "cado-nfs@inria.fr" <cado-nfs@inria.fr>
  • Subject: [cado-nfs] Unexpected Memory Use Change
  • Date: Tue, 6 Apr 2021 11:23:00 +0000 (UTC)

Dear Team Members,

I'm having a memory issue with the latest commit. Nearly my whole farm is running an earlier commit whose memory use depends on the number of client processes rather than threads, i.e. one process uses ~4GB for the current job regardless of the number of threads. With the latest commit, the memory use appears to grow with the number of threads.

I haven't tried intermediate commits, but here are some samples from the earlier commit I was using, followed by the latest one I tried.

Expected behavior from before:
=============================================================
commit 8ab2eea7202ffaf62825057c50f57359be244c29 (HEAD)
Merge: cfc03e7e4 7c10b68f0
Author: Emmanuel Thomé <emmanuel.thome@inria.fr>
Date:   Wed Oct 21 10:29:52 2020 +0200

-------client using one process with 2 threads-------
./cado-nfs-client.py --clientid=test --single --bindir=build/$HOSTNAME/ --server=http://math99.local:13573

INFO:root:Running build/math89/sieve/las -poly download/hcn10+3_930.poly -q0 533124000 -I 15 -q1 533126000 -lim0 536000000 -lim1 536000000 -lpb0 32 -lpb1 32 -mfb0 64 -mfb1 94 -ncurves0 20 -ncurves1 20 -fb1 download/hcn10+3_930.roots1.gz -out test.work/hcn10+3_930.533124000-533126000.gz -t 2 -sqside 1 -adjust-strategy 2 -stats-stderr

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND  
  15157 math89    20   0 4148676   3.7g   7292 S 200.3  24.0  16:09.45 las
----------------------------------------------------------------
-------client using one process with 6 threads-------
./cado-nfs-client.py --clientid=test --single --bindir=build/$HOSTNAME/ --override t 6 --server=http://math99.local:13573

INFO:root:Running build/math89/sieve/las -poly download/hcn10+3_930.poly -q0 533366000 -I 15 -q1 533368000 -lim0 536000000 -lim1 536000000 -lpb0 32 -lpb1 32 -mfb0 64 -mfb1 94 -ncurves0 20 -ncurves1 20 -fb1 download/hcn10+3_930.roots1.gz -out test.work/hcn10+3_930.533366000-533368000.gz -t 6 -sqside 1 -adjust-strategy 2 -stats-stderr

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND  
  15191 math89    20   0 4123972   3.6g   7200 S 551.0  23.3  16:04.17 las   
 
=============================================================
Behavior now:
=============================================================
commit 1d08d4325615139dbe007b43b31e734c4b7dec8d (HEAD -> master, origin/master, origin/HEAD)
Merge: 60f1fe069 0b8ff834d
Author: Emmanuel Thomé <emmanuel.thome@inria.fr>
Date:   Sun Apr 4 19:55:46 2021 +0000

    Merge branch 'lean-ci' into 'master'
    
    simplify ci structure, get rid of intermediary containers.
    
    See merge request cado-nfs/cado-nfs!30

-------client using one process with 2 threads-------
./cado-nfs-client.py --clientid=test --single --bindir=build/$HOSTNAME/ --server=http://math99.local:13573

INFO:root:Running build/math89/sieve/las -poly download/hcn10+3_930.poly -q0 534054000 -I 15 -q1 534056000 -lim0 536000000 -lim1 536000000 -lpb0 32 -lpb1 32 -mfb0 64 -mfb1 94 -ncurves0 20 -ncurves1 20 -fb1 download/hcn10+3_930.roots1.gz -out test.work/hcn10+3_930.534054000-534056000.gz -t 2 -sqside 1 -adjust-strategy 2 -stats-stderr

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND  
  38473 math89    20   0 7884416   7.3g   7432 S 200.3  47.1  16:06.76 las  
 
----------------------------------------------------------------
-------client using one process with 6 threads-------
./cado-nfs-client.py --clientid=test --single --bindir=build/$HOSTNAME/ --override t 6 --server=http://math99.local:13573

INFO:root:Running build/math89/sieve/las -poly download/hcn10+3_930.poly -q0 534556000 -I 15 -q1 534558000 -lim0 536000000 -lim1 536000000 -lpb0 32 -lpb1 32 -mfb0 64 -mfb1 94 -ncurves0 20 -ncurves1 20 -fb1 download/hcn10+3_930.roots1.gz -out test.work/hcn10+3_930.534556000-534558000.gz -t 6 -sqside 1 -adjust-strategy 2 -stats-stderr

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND  
  38668 math89    20   0   12.8g  12.5g   7308 S 587.4  80.1  16:05.73 las

==========================================================================

If I use all 12 cores (without the additional 12 threads), the machine grinds to a standstill, and I have to ssh in, issue a "pkill las", and wait a few minutes for it to be honored.
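
As a rough check on why 12 threads would stall the machine, the two samples from the latest commit can be extrapolated. This is only a sketch: it assumes memory grows linearly with the thread count (two data points cannot confirm that), and it estimates total RAM from the %MEM column of the 6-thread sample. The function names are illustrative and not part of cado-nfs.

```python
# Resident memory of las (RES column of top, in the units top prints),
# keyed by thread count (-t), for commit 1d08d4325.
samples = {2: 7.3, 6: 12.5}

def linear_fit(points):
    """Fit memory = base + slope * threads through exactly two samples."""
    (t0, m0), (t1, m1) = sorted(points.items())
    slope = (m1 - m0) / (t1 - t0)   # memory added per extra thread
    base = m0 - slope * t0          # thread-independent part
    return base, slope

def estimate(threads, base, slope):
    """Extrapolated resident memory under the linear assumption."""
    return base + slope * threads

base, slope = linear_fit(samples)

# Total RAM is not stated in the report; infer it from RES / %MEM
# of the 6-thread sample (12.5g at 80.1%).
total_ram = 12.5 / 0.801

for threads in (2, 6, 12):
    need = estimate(threads, base, slope)
    verdict = "fits" if need <= total_ram else "exceeds RAM"
    print(f"-t {threads:2d}: ~{need:.1f}g of ~{total_ram:.1f}g ({verdict})")
```

Under these assumptions the increase is about 1.3g per thread on top of a base of about 4.7g, so -t 12 would need roughly 20g on a machine with roughly 15.6g, which would be consistent with it swapping heavily.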

Thanks for all your work.

Sincerely,
Edwin Hall



