cado-nfs - Re: [cado-nfs] CADO-NFS LA Errors

Subject: Discussion related to cado-nfs

  • From: Emmanuel Thomé <Emmanuel.Thome@inria.fr>
  • To: Tyler Busby <tylerbusby333@gmail.com>
  • Cc: cado-nfs@inria.fr
  • Subject: Re: [cado-nfs] CADO-NFS LA Errors
  • Date: Mon, 27 May 2024 17:03:51 +0200

Hi,

Thanks for your report.

Yes, that's weird.

Do you still have the files /home/tbusby/cado-nfs/v2.-1.915/v2.-1.915.bwc.stderr.1 and /home/tbusby/cado-nfs/v2.-1.915/v2.-1.915.bwc.stdout.1?

Also, which git revision did you use for your computation?
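
For anyone gathering the same information for a report, here is a minimal sketch in Python of how the two pieces asked for above could be collected. The helper names `tail_lines` and `git_revision` are hypothetical and not part of CADO-NFS, and the sketch assumes the source tree is a git checkout with `git` on the PATH.

```python
import subprocess


def tail_lines(path, n=10):
    """Return the last n lines (n > 0) of a text file, or [] if unreadable."""
    try:
        with open(path, "r", errors="replace") as f:
            return f.read().splitlines()[-n:]
    except OSError:
        return []


def git_revision(repo_dir):
    """Return the checked-out commit hash of repo_dir, or None if unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir, check=True, capture_output=True, text=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git missing, directory missing, or not a git checkout
        return None
    return out.stdout.strip()
```

Calling `tail_lines` on each of the two files named above and `git_revision` on the directory holding the CADO-NFS sources would give the details to paste into a reply.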

Best,

E.

On Mon, May 27, 2024 at 4:23 PM Tyler Busby <tylerbusby333@gmail.com> wrote:
I've recently been having issues with the CADO-NFS linear algebra step. It seems to die at a random point (usually about once per SNFS job), but the failure is recoverable: LA completes if it is restarted. Here's the most recent death:

Info:Filtering - Merging: Merged matrix has 2336636 rows and total weight 284561353 (121.8 entries per row on average)
Info:Complete Factorization / Discrete logarithm: Filtering - Merging
Info:Filtering - Merging: Total cpu/real time for merge: 457.83/67.7782
Info:Filtering - Merging: Total cpu/real time for replay: 68.71/63.6312
Info:Linear Algebra: Starting
Info:Linear Algebra: krylov: N=3000 ; ETA (N=75000): Sun May 26 06:55:09 2024 [0.069 s/iter]
Info:Linear Algebra: krylov: N=6000 ; ETA (N=75000): Sun May 26 06:46:58 2024 [0.062 s/iter]
Info:Linear Algebra: krylov: N=9000 ; ETA (N=75000): Sun May 26 06:41:30 2024 [0.058 s/iter]
Info:Linear Algebra: krylov: N=12000 ; ETA (N=75000): Sun May 26 06:48:51 2024 [0.064 s/iter]
Info:Linear Algebra: krylov: N=15000 ; ETA (N=75000): Sun May 26 07:01:52 2024 [0.074 s/iter]
Info:Linear Algebra: krylov: N=18000 ; ETA (N=75000): Sun May 26 07:10:54 2024 [0.081 s/iter]
Info:Linear Algebra: krylov: N=21000 ; ETA (N=75000): Sun May 26 07:17:03 2024 [0.086 s/iter]
Info:Linear Algebra: krylov: N=24000 ; ETA (N=75000): Sun May 26 07:21:48 2024 [0.090 s/iter]
Info:Linear Algebra: krylov: N=27000 ; ETA (N=75000): Sun May 26 07:25:17 2024 [0.093 s/iter]
Info:Linear Algebra: krylov: N=30000 ; ETA (N=75000): Sun May 26 07:28:09 2024 [0.095 s/iter]
Info:Linear Algebra: krylov: N=33000 ; ETA (N=75000): Sun May 26 07:30:33 2024 [0.097 s/iter]
Warning:Command: Process with PID 304895 finished with return code 1
Error:Linear Algebra: Program run on server failed with exit code 1
Error:Linear Algebra: Command line was: /home/tbusby/cado-nfs/build/bubtop/linalg/bwc/bwc.pl :complete 'thr=16' 'm=64' 'n=64' 'nullspace=left' 'interval=3000' 'matrix=/home/tbusby/cado-nfs/v2.-1.915/v2.-1.915.sparse.bin' 'wdir=/home/tbusby/cado-nfs/v2.-1.915/v2.-1.915.bwc' 'interleaving=0' 'cpubinding=/home/tbusby/cado-nfs/parameters/misc/cpubinding.conf' 2> /home/tbusby/cado-nfs/v2.-1.915/v2.-1.915.bwc.stderr.1
Error:Linear Algebra: Stderr output (last 10 lines only) follow (stored in file /home/tbusby/cado-nfs/v2.-1.915/v2.-1.915.bwc.stderr.1):
Error:Linear Algebra:   #############################################################################
Error:Linear Algebra:   /home/tbusby/cado-nfs/build/bubtop/linalg/bwc/krylov cpubinding=/home/tbusby/cado-nfs/parameters/misc/cpubinding.conf interleaving=0 n=64 nullspace=left interval=3000 wdir=/home/tbusby/cado-nfs/v2.-1.915/v2.-1.915.bwc matrix=/home/tbusby/cado-nfs/v2.-1.915/v2.-1.915.sparse.bin prime=2 thr=4x4 m=64 ys=0..64 start=0
Error:Linear Algebra:   # (270f78085) /home/tbusby/cado-nfs/build/bubtop/linalg/bwc/krylov cpubinding=/home/tbusby/cado-nfs/parameters/misc/cpubinding.conf interleaving=0 n=64 nullspace=left interval=3000 wdir=/home/tbusby/cado-nfs/v2.-1.915/v2.-1.915.bwc matrix=/home/tbusby/cado-nfs/v2.-1.915/v2.-1.915.sparse.bin prime=2 thr=4x4 m=64 ys=0..64 start=0
Error:Linear Algebra:   # Compiled with gcc 11.4.0
Error:Linear Algebra:   # Compilation flags (C) -std=c99 -W -Wall -O2  -msse3 -mssse3 -msse4.1 -mpopcnt -mavx -mavx2 -mpclmul
Error:Linear Algebra:   # Compilation flags (C++) -export-dynamic -std=c++17 -Wno-c++11-compat -W -Wall -O2  -Wno-literal-suffix -msse3 -mssse3 -msse4.1 -mpopcnt -mavx -mavx2 -mpclmul
Error:Linear Algebra:   /home/tbusby/cado-nfs/build/bubtop/linalg/bwc/krylov: exited with status 1
Error:Linear Algebra:   aborted on subprogram error at /home/tbusby/cado-nfs/build/bubtop/linalg/bwc/bwc.pl line 529.
Error:Linear Algebra:           ...propagated at /home/tbusby/cado-nfs/build/bubtop/linalg/bwc/bwc.pl line 1616.
Error:Linear Algebra:

It seems bwc.pl doesn't account for this possibility: it just kills the overall script, and nothing informative appears in any of the logs. Does anybody know what might be causing this? Perhaps an automatic retry or more informative logging should be added for this case.
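
As a stopgap, the automatic retry could be approximated outside CADO-NFS. The sketch below is in Python and is only an illustration: `run_with_retry` is a hypothetical helper, not an existing CADO-NFS feature, and it assumes (as observed above) that rerunning the same command by hand is enough for LA to complete. It does nothing to diagnose the underlying failure.

```python
import subprocess
import sys
import time


def run_with_retry(cmd, max_attempts=3, delay_s=0.0):
    """Run cmd until it exits with code 0, at most max_attempts times.

    Returns the list of exit codes observed, one per attempt.
    """
    codes = []
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        codes.append(result.returncode)
        if result.returncode == 0:
            break
        # Record each failure so it is visible even if a later attempt succeeds.
        print(f"attempt {attempt} exited with code {result.returncode}",
              file=sys.stderr)
        if attempt < max_attempts:
            time.sleep(delay_s)
    return codes
```

Here `cmd` would be the argument list of whatever command is normally restarted by hand. Returning every exit code, rather than only the last one, keeps a record of how often the step actually failed.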

-Tyler


