cado-nfs - Re: [Cado-nfs-discuss] Restarting BWC with different # threads

Subject: Discussion related to cado-nfs

  • From: Paul Leyland <paul.leyland@gmail.com>
  • To: paul zimmermann <Paul.Zimmermann@inria.fr>
  • Cc: cado-nfs-discuss@lists.gforge.inria.fr
  • Subject: Re: [Cado-nfs-discuss] Restarting BWC with different # threads
  • Date: Mon, 20 Apr 2015 12:29:52 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/cado-nfs-discuss/>
  • List-id: A discussion list for Cado-NFS <cado-nfs-discuss.lists.gforge.inria.fr>

On Mon, 2015-04-20 at 13:00 +0200, paul zimmermann wrote:
> Dear Paul,
>
> thank you for the log files. The right way is not to manually modify the
> param file, but to add options to the cadofactor.py command line. I've tried
> the following with the current git version:
>
> 1) ./factor.sh
> 4150325003953540491644898572994268014120159229862228308634769166153949
>
> 2) kill the process during the linear algebra
>
> 3) grep cadofactor.py in the c70.log file, and append
> tasks.linalg.bwc.threads=2x2 at the end:
>
> $ /users/caramel/zimmerma/svn/cado-nfs/scripts/cadofactor/cadofactor.py
> /tmp/cado.9vIWPtttCt/param
> N=4150325003953540491644898572994268014120159229862228308634769166153949
> tasks.execpath=/users/caramel/zimmerma/svn/cado-nfs/build/tarte
> tasks.threads=1 tasks.workdir=/tmp/cado.9vIWPtttCt
> slaves.hostnames=localhost slaves.nrclients=1
> slaves.scriptpath=/users/caramel/zimmerma/svn/cado-nfs/scripts/cadofactor
> server.address=localhost slaves.basepath=/tmp/cado.9vIWPtttCt/client/
> tasks.linalg.bwc.cpubinding=/users/caramel/zimmerma/svn/cado-nfs/linalg/bwc/cpubinding.conf
> tasks.linalg.bwc.threads=2x2
>
> and it finished without any error:
>
> Info:Complete Factorization: Total cpu/elapsed time for entire
> factorization: 195.99/394.786
> 57544089887737977040578587881401331 72124261797351492142294585373264879
>
> Please can you confirm it works on your side?
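
Step 3 above (grep the log for the cadofactor.py command line, then append the override) can be sketched in Python. This is only an illustration: the function names are hypothetical, and it assumes that each log line carries the command on the same line as the "Command line parameters:" marker (the line breaks seen in mail are assumed to come from wrapping).

```python
import shlex

# Marker text taken from the cadofactor.py log lines quoted in this thread.
MARKER = "Command line parameters:"


def last_command_line(log_lines):
    """Return the command recorded by the most recent run, or None.

    Hypothetical helper: assumes the command follows MARKER on the same line.
    """
    command = None
    for line in log_lines:
        _, found, rest = line.partition(MARKER)
        if found:
            command = rest.strip()
    return command


def with_override(command, override):
    """Append one key=value override to a recorded command line."""
    args = shlex.split(command)
    args.append(override)
    # Re-quote each argument so the result can be pasted into a shell.
    return " ".join(shlex.quote(arg) for arg in args)
```

For example, with_override(last_command_line(open(logfile)), "tasks.linalg.bwc.threads=2x2") yields the previous command with the thread setting appended, where logfile is whatever log the run wrote in its work directory.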

Hi Paul,

Unfortunately it doesn't work, assuming I followed your instructions
correctly. First, here is the screen capture, covering everything from
killing the run, through the recommended grep, to saving the screen
contents:

8<----------------------------------------------------------------------->8
maat ~/nums/cado_fact $
[1]+ Terminated ~/nums/c*s/scripts/cadofactor/cadofactor.py c6*s
pcl@maat ~/nums/cado_fact $ ls
c2-797.params c6-345.params~ params.2_946.c130 params.5_340.c152
params.c91 work
c6-345.params params.11_264.c137 params.2_946.c130~ params.9_296.c152
tester.par
pcl@maat ~/nums/cado_fact $ grep cadofactor.py w*og
grep: w*og: No such file or directory
pcl@maat ~/nums/cado_fact $ grep cadofactor.py w*/*og
PID6243 2015-04-04 19:21:07,440 Info:root: Command line parameters:
/home/pcl/nums/cado-nfs/scripts/cadofactor/cadofactor.py c6-345.params
PID13998 2015-04-16 09:26:18,128 Info:root: Command line parameters:
/home/pcl/nums/cado-nfs/scripts/cadofactor/cadofactor.py c6-345.params
PID13998 2015-04-16 09:26:18,986 Warning:Complete Factorization: The start
time of the last cadofactor.py run was recorded, but not its end time, maybe
because it died unexpectedly.
PID14080 2015-04-16 09:28:23,941 Info:root: Command line parameters:
/home/pcl/nums/cado-nfs/scripts/cadofactor/cadofactor.py c6-345.params
PID14080 2015-04-16 09:28:24,685 Warning:Complete Factorization: The start
time of the last cadofactor.py run was recorded, but not its end time, maybe
because it died unexpectedly.
pcl@maat ~/nums/cado_fact $
/home/pcl/nums/cado-nfs/scripts/cadofactor/cadofactor.py c6-345.params
tasks.linalg.bwc.threads=2x2
Info:root: Command line parameters:
/home/pcl/nums/cado-nfs/scripts/cadofactor/cadofactor.py c6-345.params
'tasks.linalg.bwc.threads=2x2'
Info:Server Launcher: Adding maat.home.brnikat.com to whitelist to allow
clients on localhost to connect
Info:HTTP server: Using non-threaded HTTPS server
Info:HTTP server: Using whitelist:
192.168.1.3,localhost,192.168.1.0/24,localhost,maat.home.brnikat.com
Warning:Complete Factorization: Parameter sourcedir =
/home/pcl/nums/cado-nfs/ was not used anywhere
Info:Complete Factorization: Factoring
726963666889756270439831124137652409588166392285606791962251219514352986840188163853738237343537308637132102683314584642847874085671459624004763984549261
Warning:Complete Factorization: The start time of the last cadofactor.py run
was recorded, but not its end time, maybe because it died unexpectedly.
Warning:Complete Factorization: Elapsed time of last run is not known and
will not be counted towards total.
Info:HTTP server: serving at https://maat.home.brnikat.com:8001 (0.0.0.0)
Info:HTTP server: For debugging purposes, the URL above can be accessed if
the server.only_registered=False parameter is added
Info:HTTP server: You can start additional wuclient2.py scripts with
parameters: --server=https://maat.home.brnikat.com:8001
--certsha1=36de1fddc1e37b4845e86967b77ec6158a372fc2
Info:HTTP server: If you want to start additional clients, remember to add
their hosts to server.whitelist
Info:Client Launcher: Starting client id localhost on host localhost
Info:Client Launcher: Starting client id localhost+2 on host localhost
Info:Client Launcher: Starting client id localhost+3 on host localhost
Info:Client Launcher: Starting client id localhost+4 on host localhost
Info:Client Launcher: Starting client id 192.168.1.3 on host 192.168.1.3
Info:Client Launcher: Starting client id 192.168.1.3+2 on host 192.168.1.3
Info:Client Launcher: Starting client id 192.168.1.3+3 on host 192.168.1.3
Info:Client Launcher: Running clients: localhost+3 (Host localhost, PID
12495), localhost+2 (Host localhost, PID 11953), 192.168.1.3 (Host
192.168.1.3, PID 2146), localhost+4 (Host localhost, PID 12516),
localhost (Host localhost, PID 11862), 192.168.1.3+3 (Host 192.168.1.3,
PID 2192), 192.168.1.3+2 (Host 192.168.1.3, PID 2169)
Info:Polynomial Selection (size optimized): Starting
Info:Polynomial Selection (size optimized): 900 polynomials in queue from
previous run, worst lognorm 48.530000
Info:Polynomial Selection (size optimized): Already finished - nothing to do
Info:Polynomial Selection (root optimized): Starting
Info:Polynomial Selection (root optimized): Best polynomial previously found
in
/home/pcl/nums/cado_fact/work/c6-345.upload/c6-345.polyselect2.67c4o1.opt_545
has Murphy_E = 1.93e-10
Info:Polynomial Selection (root optimized): Already finished - nothing to do
Info:Polynomial Selection (root optimized): Best overall polynomial was 5-th
in list after size optimization
Info:Generate Factor Base: Starting
Info:Generate Free Relations: Starting
Info:Lattice Sieving: Starting
Info:Lattice Sieving: Reached target of 60000000 relations, now have 66015553
Info:Filtering - Duplicate Removal, splitting pass: Starting
Info:Filtering - Duplicate Removal, splitting pass: No new files to split
Info:Filtering - Duplicate Removal, splitting pass: Relations per slice: 0:
33010212, 1: 33005341
Info:Filtering - Duplicate Removal, removal pass: Starting
Info:Filtering - Duplicate Removal, removal pass: No new files for slice 0,
nothing to do
Info:Filtering - Duplicate Removal, removal pass: No new files for slice 1,
nothing to do
Info:Filtering - Duplicate Removal, removal pass: 49879192 unique relations
remain in total
Info:Filtering - Singleton removal: Starting
Info:Filtering - Singleton removal: Already have a purged file, and no new
input relations available. Nothing to do
Info:HTTP server: Got notification to stop serving Workunits
Info:Lattice Sieving: Cancelling remaining workunits
Info:Client Launcher: Stopped client localhost+3 (Host localhost, PID 12495)
Info:Client Launcher: Stopped client localhost+2 (Host localhost, PID 11953)
Info:Client Launcher: Stopped client 192.168.1.3 (Host 192.168.1.3, PID 2146)
Info:Client Launcher: Stopped client localhost+4 (Host localhost, PID 12516)
Info:Client Launcher: Stopped client localhost (Host localhost, PID 11862)
Info:Client Launcher: Stopped client 192.168.1.3+3 (Host 192.168.1.3, PID
2192)
Info:Client Launcher: Stopped client 192.168.1.3+2 (Host 192.168.1.3, PID
2169)
Info:Filtering - Merging: Starting
Info:Linear Algebra: Starting
Warning:Command: Process with PID 14057 finished with return code 255
Error:Linear Algebra: Program run on server failed with exit code 255
Error:Linear Algebra: Command line was:
/home/pcl/nums/cado-nfs/build/maat.home.brnikat.com/linalg/bwc/bwc.pl
:complete -v 'thr=2x2' 'mn=64' 'nullspace=left' 'interval=1000'
'matrix=/home/pcl/nums/cado_fact/work/c6-345.merge.sparse.bin'
'wdir=/home/pcl/nums/cado_fact/work/c6-345.bwc' 'interleaving=0'
'shuffled_product=1' > /home/pcl/nums/cado_fact/work/c6-345.bwc.bwc.stdout.4
2> /home/pcl/nums/cado_fact/work/c6-345.bwc.bwc.stderr.4
Error:Linear Algebra: Stderr output follows (stored in file
/home/pcl/nums/cado_fact/work/c6-345.bwc.bwc.stderr.4):
b'Expected 1 bfile, found 0:\nDied at
/home/pcl/nums/cado-nfs/build/maat.home.brnikat.com/linalg/bwc/bwc.pl line
774.\n'
Traceback (most recent call last):
  File "/home/pcl/nums/cado-nfs/scripts/cadofactor/cadofactor.py", line 81, in <module>
    factors = factorjob.run()
  File "/home/pcl/nums/cado-nfs/scripts/cadofactor/cadotask.py", line 4642, in run
    last_status, last_task = self.run_next_task()
  File "/home/pcl/nums/cado-nfs/scripts/cadofactor/cadotask.py", line 4710, in run_next_task
    return [task.run(), task.title]
  File "/home/pcl/nums/cado-nfs/scripts/cadofactor/cadotask.py", line 3626, in run
    raise Exception("Program failed")
Exception: Program failed
pcl@maat ~/nums/cado_fact $ cat > screen

8<----------------------------------------------------------------------->8

Then the latest stdout.4 file (you already have the previous three):

8<----------------------------------------------------------------------->8


/home/pcl/nums/cado-nfs/build/maat.home.brnikat.com/linalg/bwc/bwc.pl
:complete -v thr=2x2 mn=64 nullspace=left interval=1000
matrix=/home/pcl/nums/cado_fact/work/c6-345.merge.sparse.bin
wdir=/home/pcl/nums/cado_fact/work/c6-345.bwc interleaving=0 shuffled_product=1
$VAR1 = [
  'wdir=/home/pcl/nums/cado_fact/work/c6-345.bwc',
  'interleaving=0',
  'mn=64',
  'thr=2x2',
  '-v',
  'interval=1000',
  'nullspace=left',
  'splits=0,64',
  'ys=0..64',
  'matrix=/home/pcl/nums/cado_fact/work/c6-345.merge.sparse.bin'
];
Running find /home/pcl/nums/cado_fact/work/c6-345.bwc -name
c6-345.merge.sparse.2x2.????????.bin

8<----------------------------------------------------------------------->8

And the stderr.4 file:

8<----------------------------------------------------------------------->8

Expected 1 bfile, found 0:
Died at /home/pcl/nums/cado-nfs/build/maat.home.brnikat.com/linalg/bwc/bwc.pl
line 774.

8<----------------------------------------------------------------------->8
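
The stdout log shows bwc.pl searching the working directory for a file named c6-345.merge.sparse.2x2.????????.bin, and stderr reports that none was found. The following Python sketch reproduces that search so the check can be repeated for any thread split before restarting. It is an illustration only: the function names are hypothetical, and the assumption that the pattern is always "matrix name without .bin, then the thr value, then eight characters, then .bin" is inferred from this one log line, not from the bwc.pl source.

```python
import fnmatch
import os


def bfile_pattern(matrix_path, thr):
    """Build the file-name pattern seen in the 'Running find' log line.

    Assumption: the layout <matrix stem>.<thr>.????????.bin holds for any
    thread split; only the 2x2 case appears in the log.
    """
    base = os.path.basename(matrix_path)
    stem = base[:-len(".bin")] if base.endswith(".bin") else base
    return f"{stem}.{thr}.????????.bin"


def find_bfiles(wdir, matrix_path, thr):
    """List files under wdir whose names match the pattern for this thr."""
    pattern = bfile_pattern(matrix_path, thr)
    hits = []
    for root, _dirs, files in os.walk(wdir):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                hits.append(os.path.join(root, name))
    return sorted(hits)
```

Calling find_bfiles("/home/pcl/nums/cado_fact/work/c6-345.bwc", "/home/pcl/nums/cado_fact/work/c6-345.merge.sparse.bin", "2x2") would return an empty list in the situation logged above. The sketch only mirrors the search; it does not explain why the file is absent.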

I'll restart with the vanilla command because I'd like this
factorization to complete before I go on a trip on Wednesday. The ETA
was just after midnight Tue/Wed ...



Regards,
Paul (the other one)







