cado-nfs - Re: [cado-nfs] A couple of bugs (sopt + lingen)

Subject: Discussion related to cado-nfs

  • From: Emmanuel Thomé <Emmanuel.Thome@inria.fr>
  • To: Robert Balfour <rhb11931@gmail.com>
  • Cc: cado-nfs@inria.fr
  • Subject: Re: [cado-nfs] A couple of bugs (sopt + lingen)
  • Date: Wed, 31 Aug 2022 22:43:03 +0200

Thanks. I now have a reproducer with OMP_NUM_THREADS=2 (on the 112-vCPU
machine I'm using for tests). It also happens on a local filesystem.

I should be able to tell you more shortly.

Follow up here:

https://gitlab.inria.fr/cado-nfs/cado-nfs/-/issues/30047

Best,

E.


On Wed, Aug 31, 2022 at 12:14:53PM -0700, Robert Balfour wrote:
> Dear Emmanuel,
>
> The "Read 1086 coefficients (18.4%)" result is reproducible for me; it's
> not just a one-off. I didn't discover the bug myself: it was first
> reported by a mersenneforum user on a 136-digit composite, but he wasn't
> forthcoming with the output files I asked him for, so I decided to
> investigate.
>
> It's an NFS filesystem, rather appropriately.
>
> Requested files are attached. The sha256sum of A0-64.0-5888
> is 637573bd8b95a89188ceb3471b6a30f0144eb639b8baefccde3f47f3ef7a4711.
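For readers who want to compare their own copy of the file against the digest quoted above, here is a minimal sketch in Python. The helper name `sha256_of_file` and the chunk size are illustrative assumptions, not part of cado-nfs; the result should match what the `sha256sum` command prints.

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks.

    Hypothetical helper for illustration; reading in chunks keeps memory
    use bounded even for large files.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter(callable, sentinel) keeps calling f.read until it returns b"" (EOF)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Calling `sha256_of_file("A0-64.0-5888")` on an intact copy would be expected to return the 64-character hex string given in the message.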
>
> Best regards,
> Robert
>
> On Wed, Aug 31, 2022 at 7:48 AM Emmanuel Thomé <Emmanuel.Thome@inria.fr>
> wrote:
>
> > On Mon, Aug 15, 2022 at 12:44:43AM +0100, Robert Balfour wrote:
> > > Strangely enough, if CADO is resumed then it finds the 64 lucky columns
> > > it needs and the factorization completes successfully.
> >
> > Hold on, there's something definitely weird in your output files.
> >
> > 1st try:
> >
> > [...]
> > A0-64.0-5888
> > [...]
> > Read 1086 coefficients (18.4%) in 0.0 s (37.4 MB/s)
> > 0* [1086, recursive] 0/1
> > [...]
> > Final, t=1087: delta = 544 [62] 545 [2] 544 [62] 543 [2]
> >
> >
> > 2nd try:
> > Read 2048 coefficients (34.8%) in 0.0 s (85.9 MB/s)
> > Read 5886 coefficients (100.0%) in 0.1 s (87.2 MB/s)
> > 0* [5886, recursive] 0/1
> > [...]
> > t=5745, canceled columns: 76 79-82 84-126.
> > t=5746, canceled columns: 64-127.
> > t=5747, canceled columns: 64-127.
> > t=5748, canceled columns: 64-127, complete generator found, for sure.
> > Final, t=5748: delta = 2876 [13] 2877 2876 [2] 2877 2876 2877 [46] 2873
> > [12] 2872 2873 [2] 2872 [4] 2873 2872 [43] 2873
> >
> >
> > So on the first try, lingen was only able to read 1086 coefficients from
> > the file A0-64.0-5888, out of the 5888 coefficients that it should have
> > read. I guess that some I/O call returned a failure of some kind, and
> > that this caused a short read.
> >
> > - there's probably an error code that is mistakenly ignored in the
> > lingen code. I'll look into that.
> >
> > - as to why a read from a file that has just been created returns an
> > error (even a transient one), it could be all kinds of filesystem
> > issues, I guess. What kind of filesystem do you have?
> >
> > Best,
> >
> > E.
> >
> >
> >
> > >
> > > The relevant log files are attached: stdout.1 and stderr.1 are from the
> > > original run that crashed, stdout.2 and stderr.2 are from the
> > > resumption.
> > >
> > > Best regards,
> > > Robert
> >
> > --
> > For an independent, transparent, and rigorous evaluation!
> > I sit on Inria's CE to make my contribution to it.
> >





--
For an independent, transparent, and rigorous evaluation!
I sit on Inria's CE to make my contribution to it.


