Re: [Cado-nfs-discuss] interleaving queries


  • From: Junyi <9jhzguy@gmail.com>
  • To: Emmanuel Thomé <emmanuel.thome@gmail.com>
  • Cc: "cado-nfs-discuss@lists.gforge.inria.fr" <cado-nfs-discuss@lists.gforge.inria.fr>
  • Subject: Re: [Cado-nfs-discuss] interleaving queries
  • Date: Fri, 12 Apr 2013 15:56:24 +0800
  • List-archive: <http://lists.gforge.inria.fr/pipermail/cado-nfs-discuss>
  • List-id: A discussion list for Cado-NFS <cado-nfs-discuss.lists.gforge.inria.fr>

Interleaving works for the git version, thanks! I should have tried that before firing off the questions.

"Scheduler is at work then, and timings don't really mean much in the end" -> Not quite sure what this means?
Each of the nodes have 64 cores, and it does tax them quite a bit indeed (CPU utilization is at 9000%). This was done as we observed that running mpi=4x4 and mpi=6x6 yielded a longer ETA compared to 8x8, and it was set to 8x8 because we wanted to minimise the runtime. Is there some other issue we overlooked that is related to the scheduler? (e.g. time we're basing our decision on isn't real-time but user-time, etc?)
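
For my own sanity, here is the thread/core arithmetic as I now understand it, as a rough Python sketch. The even split of MPI jobs over the two hosts and the 4-threads-per-job reading of thr=2x2 are my assumptions, and whatever extra threads interleaving spawns are not counted:

# Back-of-envelope load check. Assumptions (mine): mpi=RxC launches R*C MPI
# jobs split evenly over the hosts, and thr=AxB gives A*B threads per job.
CORES_PER_NODE = 64
NODES = 2
THREADS_PER_JOB = 2 * 2   # thr=2x2

for r, c in [(4, 4), (6, 6), (8, 8)]:
    jobs_per_node = (r * c) / NODES
    threads_per_node = jobs_per_node * THREADS_PER_JOB
    load = threads_per_node / CORES_PER_NODE
    print(f"mpi={r}x{c}: {jobs_per_node:.0f} jobs/node, "
          f"{threads_per_node:.0f} threads on {CORES_PER_NODE} cores "
          f"(load x{load:.2f})")

If that arithmetic is right, mpi=8x8 thr=2x2 already puts about 128 threads on each 64-core node, so the ETA comparison between 6x6 and 8x8 is largely in the kernel scheduler's hands, which I suppose is what the remark was getting at.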



Some other random observations and issues:

The dispatch stage fires off many error/info messages during the local matrix construction, e.g. "Lsl0 cols 161739+8175Ss13: cols 97043+32348... [cannot be small2, beyond impl limits]". This does not stop krylov from running, though.

Also, performance at the krylov stage wasn't quite what I expected compared to the non-interleaved version (about 6 s/iter vs 2 s/iter); I will play with the parameters for now.

Incidentally, krylov from master doesn't seem to like krylov-1.1's balancing info even with the same params, and will restart the matrix dispatching; krylov-1.1 will then restart from the 0th iteration, killing previous progress, in case anyone is thinking of trying that.

Hoping to restart using a better code base, I re-ran bwc.pl on 2 nodes with IB, using cado-master on the rsa640 matrix without interleaving, but was surprised that it seg-faulted after secure finished ("...saved C.1000") but before split started. The message was: "mpiexec noticed that process rank 47 with PID 34600 on node t002 exited on signal 11 (Segmentation fault).", and bwc.pl was running on the other node (t001). I'm not sure whether this is a one-off (I previously ran bwc.pl on rsa640 with mpi=4x4, 6x6 and 8x8 with interleaving, with no issues) or something systemic.

Terminating krylov-1.1 after a checkpoint and restarting usually reduces the time per iteration noticeably (2.29 s/iter -> 1.85 s/iter, roughly a 20% drop). This has happened both times I started krylov from scratch. I'm not sure whether this indicates a potential hardware issue (cache?), a software issue (memory leak?), simply an averaging issue (i.e. more time spent earlier than later), or just old code.
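
To make the averaging guess concrete, here is a toy Python calculation. The warm-up length and cost below are invented purely for illustration, and I have not checked whether bwc's figure is a cumulative average or a per-interval rate:

# Toy model of the averaging hypothesis: slow early iterations drag a
# cumulative s/iter figure upward, while a restart from a checkpoint
# measures only steady-state iterations. All figures are hypothetical.
WARMUP_ITERS = 2000      # assumed number of slow start-up iterations
WARMUP_COST = 6.0        # assumed cost of each warm-up iteration (s)
STEADY_COST = 1.85       # steady-state cost per iteration (s)
WINDOW = 20000           # iterations over which the average is taken

with_warmup = (WARMUP_ITERS * WARMUP_COST
               + (WINDOW - WARMUP_ITERS) * STEADY_COST) / WINDOW

print(f"average incl. warm-up : {with_warmup:.2f} s/iter")
print(f"average after restart : {STEADY_COST:.2f} s/iter")

With those made-up numbers the long-run average comes out a bit under 2.3 s/iter, in the same ballpark as what I saw before restarting; if the reported figure is actually a per-interval rate, this explanation doesn't apply and the cache/memleak guesses stay on the table.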




Many thanks for your time and assistance!


On Thu, Apr 11, 2013 at 4:32 PM, Emmanuel Thomé <emmanuel.thome@gmail.com> wrote:
Please use the git version. IIRC, the ``last split do not coincide''
message at least rings a bell w.r.t. a bug I fixed a few months ago.

Your "Segmentation Fault in the dispatch stage" calls for more
information. The tail of the log file would help.

Be aware that mpi=8x8 thr=2x2 hosts=n001,n002 means that you are going
to start 32 jobs with 4 threads each on each of the two machines. I don't know
your hardware specifics, but I imagine that you may be boldly
overloading your cpus then. Scheduler is at work then, and timings
don't really mean much in the end.

Tell me what you get with the git version. Don't hesitate to post
(compressed) log files; this does help in getting an idea of what
happens.

Best,

E.



On Thu, Apr 11, 2013 at 9:50 AM, Junyi <9jhzguy@gmail.com> wrote:
> Apologies for the terribly unhelpful error/debug request previously, and I
> appreciate the really speedy response; I just wanted to make sure I got the
> syntax right.
>
> I started with something like /bwc.pl :complete matrix=rsa640.bin
> nullspace=left wdir=lustre/bwc.split thr=2x2 mpi=8x8 mn=64 ys=0..64
> interleaving=0 hosts=n001,n002 interval=100. This has no issue.
> However, the COMMS time was taking 3x longer than the CPU time (about
> 4 s/iter, still on 10GbE unfortunately), so I wanted to include
> interleaving as an option.
>
> I cleaned the wdir and ran the same command, but with interleaving=1. This
> resulted in a Segmentation Fault in the dispatch stage. The README indicates
> that I should change ys such that each krylov instance gets a 64-bit width,
> and set the splits parameter accordingly.
>
> I then ran it with interleaving=1, ys=0..128, splits=0,64,128, and it ran
> well until split, which indicated that "last split do not coincide with
> configured n".
>
> Checking the source, I understood it as having to set n=128, while leaving
> m=64. This reaches the krylov stage, but throws an "[err]
> event_queue_remove: 0x26b2bd0 (fd 11) not on queue 8" several times before
> exiting.
>
> Following your suggested syntax for interleaving, I ran it as such: bwc.pl
> :complete matrix=rsa640.bin nullspace=left wdir=lustre/bwc.split thr=2x2
> mpi=8x8 mn=128 ys=0..128 interleaving=0 hosts=n001,n002 interval=100.
> However, this meets an untimely death at the secure stage, with the
> message: abase_u64kl_set_ui_at: Assertion 'k < 64' failed.
>
> ---
>
> OpenMPI is 1.5.4; the version of cado used is 1.1, not the latest release
> via git. Non-interleaved krylov currently appears to be churning along
> happily on a two-node 64-core InfiniBand testbed (2.04 s/iter, N = 1197000).
> The size of the matrix appears to be around 38M x 38M, based on merge.log.
>
>
> On Wed, Apr 10, 2013 at 8:43 PM, Emmanuel Thomé <emmanuel.thome@gmail.com>
> wrote:
>>
>> FYI, here is a command line which successfully computes a kernel using
>> bwc's interleaving:
>>
>> ./linalg/bwc/../.././build/fondue.mpi/linalg/bwc/bwc.pl  :complete
>> matrix=/local/rsa768/mats/rsa100.bin nullspace=left wdir=/tmp/bwc
>> thr=2x2  mpi=2x2 mn=128 ys=0..128 interleaving=1
>> hosts=fondue,raclette,tartiflette,berthoud mpi_extra_args="--mca
>> btl_tcp_if_exclude lo,virbr0" interval=200
>>
>> (mpi is openmpi 1.6.1 here).
>>
>> E.
>>
>> On Wed, Apr 10, 2013 at 1:07 PM, Emmanuel Thomé
>> <emmanuel.thome@gmail.com> wrote:
>> > Can you expand on "crashes out" ?
>> >
>> > E.
>> >
>> > On Wed, Apr 10, 2013 at 12:35 PM, Junyi <a0032547@nus.edu.sg> wrote:
>> >> I'm running the cado-1.1-released version on a cluster, and have been
>> >> trying
>> >> to enable interleaving at the krylov stage to mitigate the high comms
>> >> overhead.
>> >>
>> >> Interleaving = 0, mn = 64, ys=0..64, splits=0,64 currently runs well,
>> >> but Interleaving = 1, m = 64, n = 128, ys=0..128, splits=0,64,128
>> >> crashes out.
>> >>
>> >> Have I misused the parameters? Please assist, thanks!
>> >>
>> >>
>> >>
>> >>



