
cado-nfs - Re: [Cado-nfs-discuss] Duplicate las sieve processes

  • From: Emmanuel Thomé <Emmanuel.Thome@gmail.com>
  • To: cado-nfs-discuss@lists.gforge.inria.fr
  • Subject: Re: [Cado-nfs-discuss] Duplicate las sieve processes
  • Date: Mon, 19 Dec 2011 16:10:05 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/cado-nfs-discuss>
  • List-id: A discussion list for Cado-NFS <cado-nfs-discuss.lists.gforge.inria.fr>

On Mon, Dec 19, 2011 at 10:04:08AM -0500, Zachary Harris wrote:
> I guess what I probably need to do is just follow this piece of advice
> from the documentation:
>
> # If you cannot wait for the job to be finished, feel free to kill
> it after
> # having put cores=0 in this file: cadofactor.pl will eventually
> detect that
> # the job is dead and import the partial result file.
>
> I'm planning to kill off all the duplicates, make sure the gz files are
> in good shape, and then restart cadofactor, first without those cores
> (hopefully it will import the partial results), and then putting those
> cores back in.

It certainly won't kill your computation. But I'm not sure it actually
collects the partial results.
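Before re-importing, a quick sanity check on the partial output file is to look for duplicated lines, which would suggest that two las processes interleaved their output into the same file. This is only a sketch, using the paths from this thread, and it assumes the las output format shown below (relations on non-comment lines, `# Sieving q=...` header comments):

```shell
# Sketch: sanity-check a partial las output file before re-importing it.
# The path is the one from this thread; substitute your own.
f=/tmp/cado-data/myprob.rels.44000000-45000000.gz

# Relations are the non-comment lines; a clean partial file should
# print nothing here. Any output means a line occurs more than once.
zcat "$f" | grep -v '^#' | sort | uniq -d

# Same duplicate check on the "# Sieving q=..." header lines.
zcat "$f" | grep '^# Sieving q=' | sort | uniq -d
```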

> It would make me feel better to receive a word of confirmation from
> someone who knows that this is the right course of action.

Best,

E.

>
> -Zach
>
> On 12/19/2011 09:50 AM, Zachary Harris wrote:
> > Hello,
> >
> > Because of having stopped and restarted cadofactor.pl at a point
> > while it was passing out sieve jobs, I have a couple of machines which
> > have duplicate copies of "las" with the same command line running
> > concurrently on a single box. So, for example, we see something like this:
> >
> > > ps aux | grep las
> > zach 3594 193 4.7 356744 187868 ? SNl Dec18 1181:21 /opt/cado-nfs-1.1/installed/bin/sieve/las -I 13 -poly /tmp/cado-data/myprob.poly -fb /tmp/cado-data/myprob.roots -q0 44000000 -q1 45000000 -mt 2 -out /tmp/cado-data/myprob.rels.44000000-45000000.gz
> > zach 6285 193 2.5 327896 102612 ? SNl Dec18 1159:20 /opt/cado-nfs-1.1/installed/bin/sieve/las -I 13 -poly /tmp/cado-data/myprob.poly -fb /tmp/cado-data/myprob.roots -q0 44000000 -q1 45000000 -mt 2 -out /tmp/cado-data/myprob.rels.44000000-45000000.gz
> >
> > If I peek into the output files (which seem to be about 1/4 of the way
> > done), they seem "OK" as far as I can tell(???). Namely, it seems
> > things are being done in proper order without duplicate entries. For
> > example:
> >
> > $ zcat /tmp/cado-data/myprob.rels.44000000-45000000.gz | grep Siev
> > # Sieving parameters: rlim=16000000 alim=32000000 lpbr=30 lpba=30
> > # Sieving q=44000009; rho=21111199; a0=1777611; b0=-2; a1=-220133; b1=25
> > # Sieving q=44000009; rho=16166352; a0=-839375; b0=19; a1=-1829836; b1=-11
> > # Sieving q=44000009; rho=19239778; a0=163615; b0=-16; a1=2678419; b1=7
> > # Sieving q=44000023; rho=34914830; a0=-1425942; b0=5; a1=529541; b1=29
> > # Sieving q=44000083; rho=28079786; a0=-877065; b0=-11; a1=2006678; b1=-25
> > # Sieving q=44000083; rho=10381365; a0=482873; b0=17; a1=2474623; b1=-4
> > # Sieving q=44000083; rho=10381632; a0=487412; b0=17; a1=2473555; b1=-4
> > # Sieving q=44000101; rho=9273155; a0=-189541; b0=-19; a1=-2365674; b1=-5
> > # Sieving q=44000111; rho=37922759; a0=1458647; b0=7; a1=242764; b1=-29
> > ...
> > # Sieving q=44256959; rho=22450463; a0=-643967; b0=-2; a1=1199552; b1=-65
> > # Sieving q=44256959; rho=39638757; a0=768080; b0=19; a1=-1925061; b1=10
> > # Sieving q=44256997; rho=14965671; a0=-640016; b0=-3; a1=2165351; b1=-59
> > # Sieving q=44257007; rho=14413283; a0=1017158; b0=-3; a1=1190229; b1=40
> > # Sieving q=44257013; rho=37630038; a0=231539; b0=20; a1=-2131812; b1=7
> > # Sieving q=44257027; rho=8141904; a0=1046890; b0=11; a1=1453727; b1=-27
> > # Sieving q=44257079; rho=41400590; a0=1409744; b0=15; a1=1446745; b1=-16
> > # Sieving q=44257091; rho=29819662; a0=944804; b0=3; a1=1210173; b1=-43
> > # Sieving q=44257091; rho=18016890; a0=1570268; b0=5; a1=371971; b1=-27
> > # Sieving q=44257097; rho=35764170; a0=1323079; b0=-21; a1=1792462; b1=5
> >
> > However, processes that are duplicated do seem to be making
> > significantly slower progress than the processes that aren't duplicated.
> >
> > So, my question is: Can I safely kill off one of the duplicate
> > processes? Should I kill the newer or the older one? Or should I kill
> > off both processes and restart somehow; and if so, is there anything I
> > need to do to ensure that I'll be able to make use of the progress
> > made so far (about 20 hours' worth on a couple of different machines,
> > and I'm paying for these cloud resources)?
> >
> > Many thanks!
> >
> > -Zach
>

> _______________________________________________
> Cado-nfs-discuss mailing list
> Cado-nfs-discuss@lists.gforge.inria.fr
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/cado-nfs-discuss





