cado-nfs - Re: [Cado-nfs-discuss] cado-nfs-2.1.1 crash during filtering

Subject: Discussion related to cado-nfs

  • From: Greg Marks <marks@member.ams.org>
  • To: Paul Zimmermann <Paul.Zimmermann@inria.fr>
  • Cc: cado-nfs-discuss@lists.gforge.inria.fr
  • Subject: Re: [Cado-nfs-discuss] cado-nfs-2.1.1 crash during filtering
  • Date: Tue, 1 Sep 2015 19:57:25 -0500
  • List-archive: <http://lists.gforge.inria.fr/pipermail/cado-nfs-discuss/>
  • List-id: A discussion list for Cado-NFS <cado-nfs-discuss.lists.gforge.inria.fr>

Dear Paul,

Sorry for my stupid question; I should have paid closer attention to
the traceback. The problem occurred because I had neglected to change
the slaves.scriptpath variable in my parameters file.
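
A leftover path of this kind can be spotted by scanning the parameters file for values that still mention the old installation. The following is only a rough sketch under stated assumptions: it assumes the file consists of "name = value" lines with "#" comments, and the helper name find_stale_paths and the sample values are hypothetical, not part of cado-nfs.

```python
def find_stale_paths(params_text, stale_marker):
    """Return (name, value) pairs whose value still contains stale_marker.

    Hypothetical helper, not part of cado-nfs. Assumes 'name = value'
    lines and '#' comments in the parameters file.
    """
    hits = []
    for raw in params_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue  # blank line or something that is not an assignment
        name, value = (part.strip() for part in line.split("=", 1))
        if stale_marker in value:
            hits.append((name, value))
    return hits


# Sample content is made up for illustration.
sample = """
tasks.execpath = /usr/local/cado-nfs-build-git
slaves.scriptpath = /usr/local/cado-nfs-2.1.1/scripts/cadofactor
alim = 80000000
"""
print(find_stale_paths(sample, "cado-nfs-2.1.1"))
```

Searching for the old version string rather than for specific parameter names means the check does not need to know which parameters hold paths.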

I am happy to report that, after downloading and compiling the git
version just now, everything works. I got the freerel routine to use all
8 of my cores. (Also makefb used all 8 cores--I don't recall whether
that was the case with version 2.1.1--although that step is quite fast
in any event.) As I write this, the git version has reached the lattice
sieving stage without any problems.

Sincerely,
Greg

P.S. To anyone reading this message on the cado-nfs-discuss list: it
is perhaps worth pointing out that the params/params.cXXX files in the
git version are different from those in cado-nfs-2.1.1, and a custom
parameters file modeled on one of them should be updated accordingly.
For example, the tasks.linalg.bwc.mn parameter seems to be obsolete;
the git version uses tasks.linalg.bwc.m and tasks.linalg.bwc.n.
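
As a rough illustration of such an update check, the sketch below flags obsolete parameter names in a custom parameters file. It assumes "name = value" lines with "#" comments; the RENAMED table contains only the one rename mentioned above, and the function name find_obsolete and the sample value 64 are hypothetical.

```python
# Only the rename mentioned in this message; extend by hand as needed.
RENAMED = {
    "tasks.linalg.bwc.mn": ["tasks.linalg.bwc.m", "tasks.linalg.bwc.n"],
}


def find_obsolete(params_text, renamed=RENAMED):
    """Return (line_number, old_name, replacements) for each obsolete name.

    Hypothetical helper, not part of cado-nfs. Assumes 'name = value'
    lines and '#' comments in the parameters file.
    """
    found = []
    for lineno, raw in enumerate(params_text.splitlines(), start=1):
        line = raw.split("#", 1)[0]  # ignore comments
        if "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name in renamed:
            found.append((lineno, name, renamed[name]))
    return found


# Sample content is made up for illustration.
sample = "alim = 80000000\ntasks.linalg.bwc.mn = 64\n"
for lineno, old, new in find_obsolete(sample):
    print(f"line {lineno}: {old} is obsolete; use {' and '.join(new)}")
```

The script only reports; it does not rewrite the file, since the right values for the replacement parameters have to be chosen by hand.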

------------------------------------------------
| Greg Marks |
| Department of Mathematics and Computer Science |
| St. Louis University |
| St. Louis, MO 63103-2007 |
| U.S.A. |
| |
| Phone: (314)977-7206 Fax: (314)977-1452 |
| PGP encryption public key ID: 0x53F269E8 |
| Web: http://gmarks.org |
------------------------------------------------

Message from Paul Zimmermann <Paul.Zimmermann@inria.fr>
of August 31, 2015, 09:22:46 +0200, follows:
> Greg,
>
> you are still using the 2.1.1 Python scripts, since the error log says
> "/usr/local/cado-nfs-2.1.1". With the git binaries you should use the new
> scripts from the git version. If I call the factor.sh command in README with
> the git version, then I get in the file c60.freerel.freerel.stdout.1:
>
> zimmerma@tarte:/tmp/cado.hnLhewoFJc$ head -1 c60.freerel.freerel.stdout.1
> # (17a6cc1) /users/caramel/zimmerma/svn/cado-nfs/build/tarte/sieve/freerel
> -poly /tmp/cado.hnLhewoFJc/c60.polyselect2.poly -renumber
> /tmp/cado.hnLhewoFJc/c60.freerel.renumber.gz -lpb0 18 -lpb1 19 -out
> /tmp/cado.hnLhewoFJc/c60.freerel.freerel.gz -t 2
>
> Note that freerel is called with -lpb0 and -lpb1, not with -lpbr and -lpba.
>
> In the file cadoprograms.py, around lines 739-740, you should have:
>
> lpbr: Parameter("lpb0", checktype=int),
> lpba: Parameter("lpb1", checktype=int),
>
> Nevertheless, the error message is misleading, I will fix that.
>
> Best regards,
> Paul
>
> > Date: Sun, 30 Aug 2015 21:00:25 -0500
> > From: Greg Marks <marks@member.ams.org>
> > Cc: Paul Zimmermann <Paul.Zimmermann@inria.fr>,
> > cado-nfs-discuss@lists.gforge.inria.fr
> > User-Agent: Mutt/1.5.21 (2010-09-15)
> >
> > Dear Pierrick,
> >
> > Thank you for your help. You're quite right: I had forgotten to change
> > the definition of tasks.execpath in my params file to my alternative
> > build directory, created when I compiled the git version.
> >
> > Downloading a new git version today, recompiling from scratch, making
> > sure to use the correct execpath, setting tasks.sieve.freerel.threads=8,
> > I now get this error:
> >
> > Error:Generate Free Relations: Program run on server failed with exit
> > code 1
> > Error:Generate Free Relations: Command line was:
> > /usr/local/cado-nfs-build-git/sieve/freerel -poly
> > /tmp/cado.zCGBRfYgj9/c184.polyselect2.poly -renumber
> > /tmp/cado.zCGBRfYgj9/c184.freerel.renumber.gz -lpbr 32 -lpba 32 -out
> > /tmp/cado.zCGBRfYgj9/c184.freerel.freerel.gz >
> > /tmp/cado.zCGBRfYgj9/c184.freerel.freerel.stdout.1 2>
> > /tmp/cado.zCGBRfYgj9/c184.freerel.freerel.stderr.1
> > Error:Generate Free Relations: Stderr output follows (stored in file
> > /tmp/cado.zCGBRfYgj9/c184.freerel.freerel.stderr.1):
> > b'# Warning: parameter verbose_flags is checked by this program but is
> > undocumented.\nError, missing -lpbr or -lpba command line
> > argument\nUsage: /usr/local/cado-nfs-build-git/sieve/freerel
> > <parameters>\nThe available parameters are the following:\n -poly
> > input polynomial file\n -renumber output file for renumbering table\n
> > -out output file for free relations\n -lpb0 large prime
> > bound on side 0\n -lpb1 large prime bound on side 1\n -pmin
> > do not create freerel below this bound\n -pmax do not create
> > freerel beyond this bound\n -badideals file describing bad ideals (for
> > DL)\n -addfullcol (switch) add a column of 1 in the matrix (for DL)\n
> > -t number of threads\n'
> > Traceback (most recent call last):
> > File "/usr/local/cado-nfs-2.1.1/scripts/cadofactor/cadofactor.py",
> > line 72, in <module>
> > factors = factorjob.run()
> > File "/usr/local/cado-nfs-2.1.1/scripts/cadofactor/cadotask.py",
> > line 4720, in run
> > last_status, last_task = self.run_next_task()
> > File "/usr/local/cado-nfs-2.1.1/scripts/cadofactor/cadotask.py",
> > line 4788, in run_next_task
> > return [task.run(), task.title]
> > File "/usr/local/cado-nfs-2.1.1/scripts/cadofactor/cadotask.py",
> > line 2367, in run
> > raise Exception("Program failed")
> > Exception: Program failed
> >
> > This is running with Linux Centos 6 x86_64. Reverting to cado-nfs-2.1.1,
> > I have no problems.
> >
> > Sincerely,
> > Greg
> >
> > Message from Pierrick Gaudry <pierrick.gaudry@loria.fr>
> > of August 19, 2015, 08:40:04 +0200, follows:
> > > It seems that you used a new cadofactor script with an old makefb
> > > binary.
> > > I suggest that you recompile everything from scratch with the new
> > > downloaded version.
> > >
> > > Also, there are currently many changes occurring in the git version,
> > > sometimes breaking everything. It is highly recommended to check on
> > > https://ci.inria.fr/cado/
> > > to see if there are failures with the current git version for your
> > > architecture / OS. For instance, at the time of writing this message, it
> > > is a very bad idea to git pull if you have a Mac.
> > >
> > > Pierrick
> > >
> > >
> > > On Tue, Aug 18, 2015 at 11:57:50PM -0500, Greg Marks wrote:
> > > > Dear Paul,
> > > >
> > > > I'm running into a problem with the git version, downloaded today
> > > > from here:
> > > >
> > > > git clone
> > > > https://scm.gforge.inria.fr/anonscm/git/cado-nfs/cado-nfs.git
> > > >
> > > > Running the program with the same parameter settings as before, it
> > > > terminates with this error:
> > > >
> > > > ...
> > > > Info:Generate Factor Base: Starting
> > > > Warning:Command: Process with PID 9943 finished with return code 1
> > > > Error:Generate Factor Base: Program run on server failed with exit
> > > > code 1
> > > > Error:Generate Factor Base: Command line was:
> > > > /usr/local/cado-nfs-build/sieve/makefb -poly
> > > > /tmp/cado.zCGBRfYgj9/c184.polyselect2.poly -lim 80000000 -maxbits 14
> > > > -out /tmp/cado.zCGBRfYgj9/c184.factorbase.roots.gz -t 8 >
> > > > /tmp/cado.zCGBRfYgj9/c184.factorbase.makefb.stdout.1 2>
> > > > /tmp/cado.zCGBRfYgj9/c184.factorbase.makefb.stderr.1
> > > > Error:Generate Factor Base: Stderr output follows (stored in file
> > > > /tmp/cado.zCGBRfYgj9/c184.factorbase.makefb.stderr.1):
> > > > b'Error: parameter -alim is mandatory\nUsage:
> > > > /usr/local/cado-nfs-build/sieve/makefb <parameters>\nThe available
> > > > parameters are the following:\n -poly polynomial file\n
> > > > -alim factor base bound\n -maxbits (optional) maximal
> > > > number of bits of powers\n -out (optional) name of the
> > > > output file\n -side (optional) create factor base for given
> > > > side. Side must be 0 or 1 (default is 1, i.e. algebraic).\n'
> > > > Traceback (most recent call last):
> > > > File "/usr/local/cado-nfs/scripts/cadofactor/cadofactor.py",
> > > > line 81, in <module>
> > > > factors = factorjob.run()
> > > > File "/usr/local/cado-nfs/scripts/cadofactor/cadotask.py", line
> > > > 5251, in run
> > > > last_status, last_task = self.run_next_task()
> > > > File "/usr/local/cado-nfs/scripts/cadofactor/cadotask.py", line
> > > > 5329, in run_next_task
> > > > return [task.run(), task.title]
> > > > File "/usr/local/cado-nfs/scripts/cadofactor/cadotask.py", line
> > > > 2453, in run
> > > > raise Exception("Program failed")
> > > > Exception: Program failed
> > > >
> > > > I'm using a copy of /usr/local/cado-nfs/params/params.c186 (with extra
> > > > settings appended) as the argument to cadofactor.py, which includes
> > > > this definition:
> > > >
> > > > alim = 80000000
> > > >
> > > > Using cado-nfs-2.1.1 with the same params file, there are no problems.
> > > >
> > > > Sincerely,
> > > > Greg
> > > >
> > > > Message from paul zimmermann <Paul.Zimmermann@inria.fr>
> > > > of August 17, 2015, 18:14:39 +0200, follows:
> > > > > Greg,
> > > > >
> > > > > it seems to work with the git version (just fixed the fact that the
> > > > > command
> > > > > line was not printed in c60.freerel.freerel.stdout.1):
> > > > >
> > > > > zimmerma@tarte:~/svn/cado-nfs$ ./factor.sh
> > > > > 90377629292003121684002147101760858109247336549001090677693
> > > > > tasks.sieve.freerel.threads=3
> > > > > ...
> > > > > zimmerma@tarte:~/svn/cado-nfs$ head -1
> > > > > /tmp/cado.hDYwnLCKdk/c60.freerel.freerel.stdout.1
> > > > > # (747a3bd+)
> > > > > /users/caramel/zimmerma/svn/cado-nfs/build/tarte/sieve/freerel
> > > > > -poly /tmp/cado.hDYwnLCKdk/c60.polyselect2.poly -renumber
> > > > > /tmp/cado.hDYwnLCKdk/c60.freerel.renumber.gz -lpb0 18 -lpb1 19 -out
> > > > > /tmp/cado.hDYwnLCKdk/c60.freerel.freerel.gz -t 3
> > > > >
> > > > > Please can you check again (with the latest git version) and report
> > > > > here?
> > > > >
> > > > > Paul
> > > > >
> > > > > > Date: Sat, 8 Aug 2015 14:05:16 -0500
> > > > > > From: Greg Marks <marks@member.ams.org>
> > > > > >
> > > > > > Dear Paul,
> > > > > >
> > > > > > Thank you for the appended message containing the patch. I did
> > > > > > change the dup2.c file as indicated in your message below, and
> > > > > > recompiled. The program has been sieving successfully for
> > > > > > perhaps a month since then, and I have not encountered this
> > > > > > zero-byte problem again. So this probably means that the
> > > > > > problem is solved.
> > > > > >
> > > > > > Incidentally, in another message you suggested that one could
> > > > > > speed up the free relation calculation when restarting a
> > > > > > computation in a fresh working directory with the command:
> > > > > >
> > > > > > cadofactor.py ... tasks.sieve.freerel.threads=3
> > > > > >
> > > > > > I've tried this a few times, including with other numbers in
> > > > > > place of 3, but the freerel routine uses only one core no matter
> > > > > > what I try. Admittedly this isn't that important, since it's
> > > > > > only an extra hour or two from time to time in a months-long
> > > > > > computation. (There is a sort of psychological reason to want to
> > > > > > speed this step along: what's usually happened is that I've
> > > > > > deleted some corrupted sieving file and am anxious to see
> > > > > > whether las is going to restart without errors.)
> > > > > >
> > > > > > Sincerely,
> > > > > > Greg
> > > > > >
> > > > > > Message from paul zimmermann <Paul.Zimmermann@inria.fr>
> > > > > > of June 29, 2015, 18:03:18 +0200, follows:
> > > > > > > > The program now terminates with the error message appended
> > > > > > > > at the bottom of this message. The file
> > > > > > > > $WORKDIR/c184.duplicates1//0/dup1.1.0000.gz mentioned in the
> > > > > > > > stderr below turns out to be a 20 byte file, which, when
> > > > > > > > gunzipped, is a zero byte file.
> > > > > > >
> > > > > > > it is a bug that a zero byte file is not accepted in dup2. I
> > > > > > > fixed this in commit 73aa354 (review welcome).
> > > > > > >
> > > > > > > Greg, please could you try again with this change?
> > > > > > >
> > > > > > > --- a/filter/dup2.c
> > > > > > > +++ b/filter/dup2.c
> > > > > > > @@ -432,11 +432,18 @@ int check_whether_file_is_renumbered(const char * filename, unsigned int npoly)
> > > > > > >    unsigned int count = 0;
> > > > > > >    char s[1024];
> > > > > > >    FILE *f_tmp = fopen_maybe_compressed (filename, "rb");
> > > > > > > +
> > > > > > >    if (!f_tmp) {
> > > > > > >      fprintf(stderr, "%s: %s\n", filename, strerror(errno));
> > > > > > >      abort();
> > > > > > >    }
> > > > > > >
> > > > > > > +  if (feof (f_tmp)) /* file is empty */
> > > > > > > +    {
> > > > > > > +      fclose_maybe_compressed (f_tmp, filename);
> > > > > > > +      return 1; /* an empty file might be considered as renumbered */
> > > > > > > +    }
> > > > > > > +
> > > > > > >    /* Look for first non-comment line */
> > > > > > >    while (1) {
> > > > > > >      char *ret = fgets (s, 1024, f_tmp);
> > > > > > >
> > > > > > > Paul
> > > > > >
