Re: [Cado-nfs-discuss] Statistic Visualization

  • From: Emmanuel Thomé <Emmanuel.Thome@inria.fr>
  • To: cado-nfs-discuss@lists.gforge.inria.fr
  • Subject: Re: [Cado-nfs-discuss] Statistic Visualization
  • Date: Tue, 3 Jan 2017 09:58:06 +0100
  • List-archive: <http://lists.gforge.inria.fr/pipermail/cado-nfs-discuss/>
  • List-id: A discussion list for Cado-NFS <cado-nfs-discuss.lists.gforge.inria.fr>

Hi,

Nice!

From a cado-nfs standpoint, we are more interested in the following
things:
1 - decide which data needs to be extracted,
2 - work out how to actually extract it,
3 - present it nicely.

Item 3 above is less important than the previous two, and eye-candy is
probably the least important thing of all.

In my experience with large computations, I have found it useful to do
the following things:

- monitor, over time, the number of resubmitted WUs (because some jobs
get killed without notice; this happens when we use idle time on
otherwise busy HPC resources).
- monitor, over time, the number of WUs processed, and relations
obtained, by each "group" of machines. I've run computations spanning
several clusters which come on and off, and it helps to get an idea of
the average yield of some given "group", say over a week or so.
- order nodes (or groups of nodes) by last completed WU. This is
potentially very useful for detecting when the connection to some
given set of nodes breaks.
- accumulate some definition of CPU time spent -- which is a bit tricky
when nodes are oversubscribed. The WU point of view alone does not
carry enough information to compute that.
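The "order by last completed WU" idea from the list above can be sketched in a few lines. This is a toy reproduction on an in-memory sqlite database; the column names (status, assignedclient, timeresult) are assumptions for illustration, and the real cado-nfs workunits schema may well differ:

```python
import sqlite3

# Toy in-memory DB; column names (status, assignedclient, timeresult)
# are assumptions -- the real cado-nfs workunits schema may differ.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE workunits
              (wurowid INTEGER PRIMARY KEY, status INTEGER,
               assignedclient TEXT, timeresult TEXT)""")
db.executemany(
    "INSERT INTO workunits(status, assignedclient, timeresult) VALUES (?,?,?)",
    [(5, "cluster1-00+1", "2017-01-02 10:00:00"),
     (5, "cluster1-00+1", "2017-01-02 11:00:00"),
     (5, "cluster2-03+2", "2017-01-01 09:00:00"),
     (1, "cluster2-03+2", None)])          # status 1 = still ASSIGNED

# Most recently active client first; a stale client sinks to the bottom,
# which is how a broken connection to a set of nodes would show up.
rows = db.execute("""SELECT assignedclient, MAX(timeresult) AS last_done
                     FROM workunits WHERE status = 5
                     GROUP BY assignedclient
                     ORDER BY last_done DESC""").fetchall()
for client, last in rows:
    print(client, last)
```

The MAX over ISO-formatted timestamps works because they sort lexicographically; nodes whose last completed WU is old end up at the bottom of the listing.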

It is also important that this data can be extracted regardless of which
database back-end is used (sqlite or mysql).

Above, the notion of "groups" of machines calls for some external data.
In our latest hsnfs1024 computation, I defined a few auxiliary tables to
provide this data, e.g.

mysql> select * from clusters limit 3;
+----------+-------+-------+--------+------+-------+------------+
| cluster  | site  | cores | vcores | ram  | njobs | jobthreads |
+----------+-------+-------+--------+------+-------+------------+
| cluster1 | siteA |    16 |     32 |   64 |    10 |          4 |
| cluster2 | siteB |     8 |      8 |   16 |     2 |          4 |
| cluster3 | siteB |    16 |     16 |  128 |    16 |          1 |
+----------+-------+-------+--------+------+-------+------------+
3 rows in set (0.00 sec)
mysql> select * from clients limit 3;
+-------------------------+-----------+
| clientid | cluster |
+-------------------------+-----------+
| cluster1-00.loria.fr+1 | cluster1 |
| cluster1-00.loria.fr+10 | cluster1 |
| cluster1-00.loria.fr+2 | cluster1 |
+-------------------------+-----------+
3 rows in set (0.00 sec)
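The same auxiliary tables can be set up on an sqlite back-end just as easily. A minimal sketch, with the table layout copied from the dumps above (the CREATE TABLE statements are my reconstruction, not the actual DDL used):

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Layout reconstructed from the dumps above; not the actual DDL used.
db.execute("""CREATE TABLE clusters
              (cluster TEXT PRIMARY KEY, site TEXT, cores INTEGER,
               vcores INTEGER, ram INTEGER, njobs INTEGER,
               jobthreads INTEGER)""")
db.execute("CREATE TABLE clients (clientid TEXT PRIMARY KEY, cluster TEXT)")
db.executemany("INSERT INTO clusters VALUES (?,?,?,?,?,?,?)",
    [("cluster1", "siteA", 16, 32, 64, 10, 4),
     ("cluster2", "siteB", 8, 8, 16, 2, 4),
     ("cluster3", "siteB", 16, 16, 128, 16, 1)])
db.executemany("INSERT INTO clients VALUES (?,?)",
    [("cluster1-00.loria.fr+1", "cluster1"),
     ("cluster1-00.loria.fr+10", "cluster1"),
     ("cluster1-00.loria.fr+2", "cluster1")])

# Map a client to its site via the clusters table.
row = db.execute("""SELECT site FROM clients
                    JOIN clusters USING (cluster)
                    WHERE clientid = ?""",
                 ("cluster1-00.loria.fr+1",)).fetchone()
print(row[0])   # siteA
```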


Then I had some expensive SQL queries which walked the complete
workunits table and came up with statistics. Here, counting by total
number of WUs completed per site:

mysql> select site, sum(wus) as score
    -> from (select cluster, count(*) as wus
    ->       from (select * from workunits where status=5) x
    ->       left join clients on x.assignedclient=clients.clientid
    ->       group by cluster) a
    -> left join clusters using (cluster)
    -> group by site order by score;
+--------+--------+
| site   | score  |
+--------+--------+
| siteA  |  77823 |
| siteB  |  98326 |
| siteC  | 153184 |
| siteD  | 167058 |
| siteE  | 387611 |
| siteF  | 523427 |
+--------+--------+
6 rows in set (1 min 14.28 sec)
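That per-site aggregation is standard SQL and runs unchanged on sqlite. A toy reproduction of its shape (completed WUs grouped by cluster, then summed by site), using a hypothetical minimal schema:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical minimal schema, just enough for the join chain.
db.executescript("""
CREATE TABLE workunits (wurowid INTEGER PRIMARY KEY, status INTEGER,
                        assignedclient TEXT);
CREATE TABLE clients  (clientid TEXT PRIMARY KEY, cluster TEXT);
CREATE TABLE clusters (cluster TEXT PRIMARY KEY, site TEXT);
INSERT INTO clients  VALUES ('a+1','cluster1'), ('b+1','cluster2');
INSERT INTO clusters VALUES ('cluster1','siteA'), ('cluster2','siteB');
INSERT INTO workunits(status, assignedclient) VALUES
    (5,'a+1'), (5,'a+1'), (5,'b+1'), (1,'b+1');  -- status 5 = completed
""")

# Same shape as the per-site query above: completed WUs -> cluster -> site.
rows = db.execute("""
    SELECT site, SUM(wus) AS score FROM
      (SELECT cluster, COUNT(*) AS wus FROM
         (SELECT * FROM workunits WHERE status=5) x
       LEFT JOIN clients ON x.assignedclient = clients.clientid
       GROUP BY cluster) a
    LEFT JOIN clusters USING (cluster)
    GROUP BY site ORDER BY score""").fetchall()
print(rows)   # [('siteB', 1), ('siteA', 2)]
```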

Also, queries such as "what happened recently?" go by:

mysql> select cluster, count(*), wurowid
    -> from (select * from workunits where status=1
    ->       and unix_timestamp(timeassigned) >= unix_timestamp(now())-14400) x
    -> left join clients on x.assignedclient=clients.clientid
    -> group by cluster;

(here, status 1 selects ASSIGNED WUs).
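One back-end portability wrinkle: unix_timestamp() is MySQL-only. A sketch of the same "assigned in the last 4 hours" filter translated for sqlite, using strftime('%s', ...) instead (schema and column names again assumed, and the `+ 0` cast matters because sqlite's strftime returns text):

```python
import sqlite3, time

db = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real workunits table has more columns.
db.execute("""CREATE TABLE workunits (wurowid INTEGER PRIMARY KEY,
              status INTEGER, assignedclient TEXT, timeassigned TEXT)""")
db.execute("CREATE TABLE clients (clientid TEXT, cluster TEXT)")
now = int(time.time())
db.executemany("INSERT INTO workunits(status, assignedclient, timeassigned) "
               "VALUES (?,?,datetime(?, 'unixepoch'))",
    [(1, "c1+0", now - 3600),      # assigned an hour ago -> counted
     (1, "c1+1", now - 86400),     # assigned yesterday   -> too old
     (5, "c1+2", now - 3600)])     # completed            -> not status 1
db.executemany("INSERT INTO clients VALUES (?,?)",
    [("c1+0", "cluster1"), ("c1+1", "cluster1"), ("c1+2", "cluster1")])

# sqlite equivalent of:
#   unix_timestamp(timeassigned) >= unix_timestamp(now()) - 14400
# The "+ 0" coerces strftime's text result to a number before comparing.
rows = db.execute("""SELECT cluster, COUNT(*)
                     FROM workunits x
                     LEFT JOIN clients ON x.assignedclient = clients.clientid
                     WHERE status = 1
                       AND strftime('%s', timeassigned) + 0
                           >= strftime('%s','now') - 14400
                     GROUP BY cluster""").fetchall()
print(rows)   # [('cluster1', 1)]
```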

Overall, this leads to a cookbook of SQL queries I ran quite frequently.


A web-accessible frontend could make such queries for the user, maybe.
We should make sure the user doesn't inadvertently (or maliciously) DoS
the system by running expensive queries, though.

Maybe some design choices in the server database layout are to blame
for some queries taking long; improvements could be needed.
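One cheap thing to try would be indexes on the columns the cookbook queries filter and join on. A sketch on sqlite, using the column names from the queries above; whether such indexes suit the real server's schema and write load is an open question, not something this demonstrates:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical minimal schema matching the queries above.
db.execute("""CREATE TABLE workunits (wurowid INTEGER PRIMARY KEY,
              status INTEGER, assignedclient TEXT, timeassigned TEXT)""")

# Indexes matching the filters/joins in the cookbook queries.
db.execute("CREATE INDEX idx_wu_status ON workunits(status)")
db.execute("CREATE INDEX idx_wu_client ON workunits(assignedclient)")

# EXPLAIN QUERY PLAN confirms the status filter now uses the index
# instead of a full table scan.
plan = db.execute("EXPLAIN QUERY PLAN "
                  "SELECT count(*) FROM workunits WHERE status=5").fetchall()
print(plan)
```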


How this could all be presented is a matter of taste (I'm fine with text
tables, and plain plots for data which is to be plotted over a time
range).


I'm somewhat undecided as to how best to do such queries. A full-blown
apache with php enabled is not really to my taste for achieving this. A
homemade server doing the heavy lifting would bring more flexibility
(and avoid adding one extra language), but would perhaps not be ideal
either: we should make sure we don't expose it to the outside world. I
think that embedding this functionality in the main server as it exists
now should be out of the question. It should be two separate processes.
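A minimal sketch of one way to keep the two processes separate: a small scheduled script runs a fixed set of pre-chosen queries and writes a JSON snapshot, and only that file is exposed through whatever webserver serves the frontend, so no user-supplied SQL ever reaches the database. All names (function, table, columns) here are hypothetical:

```python
import json, os, sqlite3, tempfile

def dump_stats(dbpath, outpath):
    """Run a fixed set of cheap queries and write a JSON snapshot.

    Exposing only the output file to the web means no user-supplied SQL
    ever reaches the database (hypothetical helper, hypothetical schema).
    """
    db = sqlite3.connect(dbpath)
    stats = {
        "completed_wus": db.execute(
            "SELECT count(*) FROM workunits WHERE status=5").fetchone()[0],
    }
    db.close()
    with open(outpath, "w") as f:
        json.dump(stats, f)

# Demo on a throwaway database.
tmp = tempfile.mkdtemp()
dbpath = os.path.join(tmp, "wu.db")
db = sqlite3.connect(dbpath)
db.execute("CREATE TABLE workunits (wurowid INTEGER PRIMARY KEY, status INTEGER)")
db.executemany("INSERT INTO workunits(status) VALUES (?)", [(5,), (5,), (1,)])
db.commit(); db.close()

dump_stats(dbpath, os.path.join(tmp, "stats.json"))
print(open(os.path.join(tmp, "stats.json")).read())   # {"completed_wus": 2}
```

Run from cron (or a timer in the stats process), this keeps the query cost bounded and independent of how many users load the page.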


E.

On Sun, Jan 01, 2017 at 10:26:56PM +0000, Grüninger, Micha wrote:
> Hey guys,
>
> I have written some scripts to visualize progress and participation. Of
> course, the scripts are not perfect, but they work for me so far, and I
> think they are a good start. If you want to see them in action, visit:
> http://149.201.240.24/ . Since you shared your cool program, I think it is
> only fair to share my work. It consists of these three files:
>
> * data_eta.php: Reads the log, extracts and processes information about
> the ETA. Saves it to a file that needs to be shared by a webserver.
> * data_list_wus.php: Extracts information from an SQLite3 DB, processes
> it, and saves it to a file that needs to be shared by a webserver.
> * index.html: Takes the output files of the two previous scripts and
> visualizes them. Needs to be shared by a webserver. This is the file the
> user sees.
>
> Some words about the example page:
> We are a group of students at FH-Aachen, and we are factoring a 200-digit
> number. We computed the polynomial selection with the GPU version of
> Msieve, and are now using cado-nfs for the second step. Because our server
> crashed several times, we have slumps in the ETA on 23 and 29 December.
>
> I see that it is not optimal that I used PHP for the data generation
> scripts, because users must install another interpreter to use them. Feel
> free to convert them to another scripting language, if that bothers you.
>
> I attached the files.
>
> Greetings
>
> Micha Grüninger
>
> Team Rocket




> _______________________________________________
> Cado-nfs-discuss mailing list
> Cado-nfs-discuss@lists.gforge.inria.fr
> http://lists.gforge.inria.fr/mailman/listinfo/cado-nfs-discuss




