
Re: [Cado-nfs-discuss] Duplicate las sieve processes


  • From: Zachary Harris <zacharyharris@hotmail.com>
  • To: cado-nfs-discuss@lists.gforge.inria.fr
  • Subject: Re: [Cado-nfs-discuss] Duplicate las sieve processes
  • Date: Mon, 19 Dec 2011 10:04:08 -0500
  • List-archive: <http://lists.gforge.inria.fr/pipermail/cado-nfs-discuss>
  • List-id: A discussion list for Cado-NFS <cado-nfs-discuss.lists.gforge.inria.fr>

  I guess what I probably need to do is just follow this piece of advice from the documentation:
# If you cannot wait for the job to be finished, feel free to kill it after
# having put cores=0 in this file: cadofactor.pl will eventually detect that
# the job is dead and import the partial result file.
I'm planning to kill off all the duplicates, make sure the gz files are in good shape, and then restart cadofactor, first without those cores (hopefully it will import the partial results), and then put those cores back in.
  It would make me feel better to receive a word of confirmation from someone who knows that this is the right course of action.
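
For concreteness, here is a rough sketch of what I have in mind (the host name and the layout of the machine description file below are my own guesses for illustration, not taken from the docs):

# 1. In the machine description file passed to cadofactor.pl, set cores=0
#    for the affected hosts, e.g. something along the lines of:
#        node01 cores=0 tmpdir=/tmp cadodir=/opt/cado-nfs-1.1/installed
# 2. On each affected machine, double-check and then kill the stray sievers:
$ pgrep -fl /opt/cado-nfs-1.1/installed/bin/sieve/las
$ pkill -f /opt/cado-nfs-1.1/installed/bin/sieve/las
# 3. Restart cadofactor.pl with the same parameter file; once it decides the
#    jobs are dead it should import the partial myprob.rels.*.gz files.
# 4. Restore the original cores= values and restart cadofactor.pl once more.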

-Zach

On 12/19/2011 09:50 AM, Zachary Harris wrote:
Hello,

  Because I stopped and restarted cadofactor.pl while it was handing out sieve jobs, I have a couple of machines on which duplicate copies of "las", with the same command line, are running concurrently on a single box. So, for example, we see something like this:
$ ps aux | grep las
zach      3594  193  4.7 356744 187868 ?       SNl  Dec18 1181:21 /opt/cado-nfs-1.1/installed/bin/sieve/las -I 13 -poly /tmp/cado-data/myprob.poly -fb /tmp/cado-data/myprob.roots -q0 44000000 -q1 45000000 -mt 2 -out /tmp/cado-data/myprob.rels.44000000-45000000.gz
zach      6285  193  2.5 327896 102612 ?       SNl  Dec18 1159:20 /opt/cado-nfs-1.1/installed/bin/sieve/las -I 13 -poly /tmp/cado-data/myprob.poly -fb /tmp/cado-data/myprob.roots -q0 44000000 -q1 45000000 -mt 2 -out /tmp/cado-data/myprob.rels.44000000-45000000.gz
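
In case it's useful to anyone else, a one-liner along these lines (GNU ps and coreutils assumed) lists any las command line that is currently running more than once, with a count:

# print duplicated command lines containing "/las ", prefixed by how many
# processes share them
$ ps -eo args | grep '[/]las ' | sort | uniq -cd
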
If I peek into the output files (which seem to be about 1/4 of the way done), they look OK as far as I can tell: the special-q values appear in order and I don't see duplicate entries. For example:
$ zcat /tmp/cado-data/myprob.rels.44000000-45000000.gz | grep Siev
# Sieving parameters: rlim=16000000 alim=32000000 lpbr=30 lpba=30
# Sieving q=44000009; rho=21111199; a0=1777611; b0=-2; a1=-220133; b1=25
# Sieving q=44000009; rho=16166352; a0=-839375; b0=19; a1=-1829836; b1=-11
# Sieving q=44000009; rho=19239778; a0=163615; b0=-16; a1=2678419; b1=7
# Sieving q=44000023; rho=34914830; a0=-1425942; b0=5; a1=529541; b1=29
# Sieving q=44000083; rho=28079786; a0=-877065; b0=-11; a1=2006678; b1=-25
# Sieving q=44000083; rho=10381365; a0=482873; b0=17; a1=2474623; b1=-4
# Sieving q=44000083; rho=10381632; a0=487412; b0=17; a1=2473555; b1=-4
# Sieving q=44000101; rho=9273155; a0=-189541; b0=-19; a1=-2365674; b1=-5
# Sieving q=44000111; rho=37922759; a0=1458647; b0=7; a1=242764; b1=-29
...
# Sieving q=44256959; rho=22450463; a0=-643967; b0=-2; a1=1199552; b1=-65
# Sieving q=44256959; rho=39638757; a0=768080; b0=19; a1=-1925061; b1=10
# Sieving q=44256997; rho=14965671; a0=-640016; b0=-3; a1=2165351; b1=-59
# Sieving q=44257007; rho=14413283; a0=1017158; b0=-3; a1=1190229; b1=40
# Sieving q=44257013; rho=37630038; a0=231539; b0=20; a1=-2131812; b1=7
# Sieving q=44257027; rho=8141904; a0=1046890; b0=11; a1=1453727; b1=-27
# Sieving q=44257079; rho=41400590; a0=1409744; b0=15; a1=1446745; b1=-16
# Sieving q=44257091; rho=29819662; a0=944804; b0=3; a1=1210173; b1=-43
# Sieving q=44257091; rho=18016890; a0=1570268; b0=5; a1=371971; b1=-27
# Sieving q=44257097; rho=35764170; a0=1323079; b0=-21; a1=1792462; b1=5
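
A quick sanity check with nothing but standard shell tools should flag any (q, rho) pair that got sieved twice; it prints nothing when every "# Sieving" header is unique (repeated q values by themselves are expected, since one q can yield several rho):

# print any "# Sieving q=...; rho=..." header that occurs more than once
$ zcat /tmp/cado-data/myprob.rels.44000000-45000000.gz | grep '^# Sieving q=' | sort | uniq -d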

However, the duplicated processes do seem to be making significantly slower progress than the ones that aren't duplicated.

  So, my questions are: Can I safely kill off one of the duplicate processes? If so, should I kill the newer or the older one? Or should I kill off both and restart somehow? In that case, is there anything I need to do to make sure I can still use the progress made so far (about 20 hours' worth on a couple of different machines, and I'm paying for these cloud resources)?
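
For what it's worth, the progress itself is easy to quantify: as far as I can tell every non-comment line in the .gz file is one relation, so a count like this shows how much has been produced so far:

# count relations (non-comment lines) collected so far in one output file
$ zcat /tmp/cado-data/myprob.rels.44000000-45000000.gz | grep -vc '^#'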

Many thanks!

-Zach



