coq-club AT inria.fr
Subject: The Coq mailing list
List archive
- From: Théo Zimmermann <theo.zimmi AT gmail.com>
- To: coq-club AT inria.fr, dholland-coq AT netbsd.org
- Subject: Re: [Coq-Club] strings.
- Date: Thu, 8 Nov 2018 14:03:38 +0100
- Authentication-results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=theo.zimmi AT gmail.com; spf=Pass smtp.mailfrom=theo.zimmi AT gmail.com; spf=None smtp.helo=postmaster AT mail-wr1-f67.google.com
- Autocrypt: addr=theo.zimmi AT gmail.com; prefer-encrypt=mutual; keydata= xsFNBFP+V00BEADi9eGxuXfvs0HxaYDxrOBpCJWakA4+lCi3KUGqNS5Xq0V7q+BWZZmF/pOO x4EemIickFV+DA/Yp9542XWw1vMK3uGAUpnB4sUvD+qPDnRFHDZH60p1b8rRzukHRdukW33g QsEjDdNp9iVebuBMe+IU2gdpdZxjgxzxShQMIHOX1Az65oC81eEV37XJ7b89WjArEEe0Hxcg 9+jFH3P3pNdlVaxMgEaYxeepTEfBIhWJjobr299gkgUm2a79vYnci0VONNk3crZqpNvrDK9h bqBgsEUv7t8JnIDQOJTncuoeZA+YFkceYKysO4tyDgNPgXSuI8tUfo7eU1uacNvX7LoKH1nT d4Gr26nPUyj8ntnvnT88LK+CxrrfYkUxtK5pYIQLzJSnCaCuoxOu4WJn+6PPtMhUNWC50fmA +HBt3v8YhheNEEbBZPqEGE9O1A0xB5oJJ3UCsf2VpxiEc3qCTzZHPbw5hYsS96MRysJ3gl9Z OG/u8M5r/JlyiDbVOj6WEcjgMYhuowbEX45UCCufgMWlVFDqKj685zCwhPxYSKvW2+Pkvw1K YmM4tQuF3bH2oQXe8W791N39JXQU/LHD7EHnmoJW/6ySc7Mv2f9cId/HY/DeM1U1TTTEEr1Q j6o/6PdRHGWT5PZpbNDy9+sdBieF1JWuXx24KHbhbN/OCebiewARAQABzSdUaMOpbyBaaW1t ZXJtYW5uIDx0aGVvLnppbW1pQGdtYWlsLmNvbT7CwX0EEwEIACcFAli5dBUCGyMFCQlmAYAF CwkIBwIGFQgJCgsCBBYCAwECHgECF4AACgkQ8XRKCUL1Nsf97A//QpWey31m7X20iHRRO833 BPyziP9QbIQveB8BofZfjWXBJoGExP2EZdb10UMwU0tsnD6NT5E8nKr9rRRTtnwXOcqaQHGd Y4mVLEyJYwZ8g4uZglMyTCHx6k064Ik4c5DysS/15AB72SZm+fJFayIzSY1syvTnTPxTnNx7 l697IbjS3aq3/W+vQ6MwnDgnQqggGMiA0+Tmp5VPCZX9deLmXWC1zeXe/PdCstPzeHTiIHgD wQW+0edCBELbJdhoVBKsQ3OfBK7uMJBXug7zqI3wAoywvd/E12tFzZJhLZiq8arDIimeccf4 bFZUIy17ApiFfAZJU5Rm71H29IFFMTyQxi4QFFiPjWKwtUTDB/Xv6V8nGgdhOHkNgjnRpPn4 jNxSnvr3/f1OW2yGFYyxTUlmYg72eTkSNlPAwd+WA5K3aQQtwUERzhPsK2l4Smjqr7NcI+5X 0aHBVRyRGrTmtTRz/hdyoGoxx4ts1TacquKHx88YH9GZhEng9h+Ak70oQYD9u+atp93wTyTo tghwB5Ewm3O9oruIdQtWx2H4PtP4Y3V1O/DjK42D3wKwikiHisCTKtbuIsgLTnxebWC7wN0q X9nln3KJkEnxZ/6pjLQ3XZB5YeUq58bBnr06VTZYjjYM5aZ6cQrOrnWXvrsrTh367cPKesk5 G5h/itHB0UQc+eXOwU0EU/5XTQEQAL7skL+t5KvE/j2zAzuUFtwjIA+GPBcwOJnITdT3fddN S18gCBx+EH/YPbCphKXOxRFeaD+EDuC1Ls5XK+lYu9hR7rurW7RQI+1XKO7KHQvp4i2I8J4C C+kmMw9K7fysHFG9ckiw1G5WrwiiPZHLadRIAfXwMLoDfRRNu/W8ogjm1m+UWm88DISrkIbE BYEw/fWX8QGGcFvExrFAgXyXXn4vVG90XNre+wJwIADcCqQJXCfzNF/tyetcGEwc3BZiUe/Q Ha/3xLDSm19rDkVWSpWGUr0uSziPSR9kYw6q4F1gKdvex6hnduRMEJLwFRfiHKnHZO2uN8Zj MsB1EklGvkockjkThQfFjeU3zysxfsl4tQRROpglgs1MhiMZpvCdnNTn+Gr+YHQ0sueBxpk0 6GHMmE4C4jcrLFXwfEWOnk6ISHluZS1f9qRnNZzj9ICJItT7cn1KE54jZjmVudi4kOSbihaZ 70PrXMhwphz4nVxSRMIsXdNwSkpCrcOatDS4E7EjRgHW5+CP5WON2aJPD8htAtKNej5nPbTD NFJKz6nG4jdDpxtY0PEkMHarrBpWaIdPEPgdCew5Ns5cAZC0vMkJU8uuSoEumGqzUwdVIWaV p0iZ7MTaWSZeifx7Y6x5YkYR5dMGGzKW5GeLQ7emG0I1BTcFeP+VAGewLA5xWjnhABEBAAHC wWUEGAECAA8FAlP+V00CGwwFCQlmAYAACgkQ8XRKCUL1NscXDRAAx5Y5fyDX/HvFD2Sp4E1h HbtNsPfDQ076U3qvg31e7d5560fW5hwSgZ3++3kGRO7QFTq3edo3oEEWiHuHBJxT7eXSPhXY vsM/atDyQ7Zy6fEUVuv77b3TEDHuWvn4A2kBI+ApCzXohscTuCc/PvVCoDrRQ++be2hIlxsq M7DnRr3/6FI8chCUHj3bXeXtPp8M74/3RGCiwvPZ7825NU0JP6jj1yeEuy6ts/GaZFsDj3sz TxCeZxbdARFxqmT0lf3yF2uQWj35jwqBUC/S+07fZfRRRFXwHTSSzsTTg4JKvsEpq6v0/ryc ETXqwstze+nvPvZ/bLlkiFulLFajYdIky+t5JmjmMEW0/NpK4nRo/QNuTyF1vG1/tjg/q5CZ gW++ZBoYuOR/+roFRPEKIu6Aa843e+QXvpHv4WHN1viH7ufY3D389hKkkhtnEvUu1Mv7ITDD ryw3ok6KFBrjvMQVOp0L7yZc9RYtJ0mXH9n1wWJL94q4hbUJ7fqz2rv8RbTGQoqKfDQVuVuo gH36yLP9s+RKxePuqaTn6DOOHF0/hhMCwes9pWs6ObrEPUVG1YtM6TwopIUJtuLZFWWb3X1Q 4W8EyZElzlCZhLTVhJT7p2k8xBEjpdMpsHhBMjOMljCUeRdJX55g2DVAwx3fvhdOJkuXY5Tl nCgw1L3/lCt46SU=
- Ironport-phdr: 9a23:t7H3WBLu01vC7P7Rv9mcpTZWNBhigK39O0sv0rFitYgRI/nxwZ3uMQTl6Ol3ixeRBMOHs60C07KempujcFRI2YyGvnEGfc4EfD4+ouJSoTYdBtWYA1bwNv/gYn9yNs1DUFh44yPzahANS47xaFLIv3K98yMZFAnhOgppPOT1HZPZg9iq2+yo9JDffwdFiCChbb9uMR67sRjfus4KjIV4N60/0AHJonxGe+RXwWNnO1eelAvi68mz4ZBu7T1et+ou+MBcX6r6eb84TaFDAzQ9L281/szrugLdQgaJ+3ART38ZkhtMAwjC8RH6QpL8uTb0u+ZhxCWXO9D9QLYpUjqg8qhrUgflhjoZOT438G/ZicJ+g6xUrx2juxNy2IHUbJ2POfR5Yq/Qc9EXSGxcVchRTSxBBYa8YpMRAeoBJ+FYqJHyqFgPrRu7AAmjGvnvyjpSiX/w260xzuMsER3G3AM+GNICqnXVrNTwNKcXUOC416bIzTDZYPNX3Tfx8pTHchckofyVW797bMTfyU4qFwzfj1WQr5ToPzyU1uQRs2ib8vFvWfizhG4grgF8pCWkyMQ0ioTRmI4Z1lTJ+T96zYs1P9G0VlB3bcO+HJZRtSyWL4h7Tt44T211vCs3xacKtYC5cSQQ1Zgqxx3SZvqaeIaS+B3jTvyeITJgiXJlZr2/gxGy/FClyuLmV8m01E9GryRfktXRr3wN2R3e58qdRvty+Ueh3jmP1wTN5e1ePU80kq/bJ4Ygwr42iJUTrVzOEjHqlEjylqObdUUp9vK25+j5f7nqvJ6ROo1shgH7KKsum8i/AeoiMggJWmiW4fi81L398k3+WrlLj/w2kqzcsJDbPskbqau5DBVU0oYn8RqwEzCm0NEAkXkdMF1FYA6Hj5TuO1zWPP/4Cu6/j02wnzdv2vDJJabsAo7NL3jGiLfuZ6xx609ayAopzNBQ/YhYCr8bIKG7Zkikv9vBSxQ9LgacwuD9Cdw72JlaEW6PBaudOaXe9FSP/8opLvWGf4NTvyzyeNY/4Pu7sX+4nmgvfKys0IESYXa+VqB6I0ieJ2jth9IAOWgPtws6CuftjQvRAnZoe3+uUvdktXkAA4W8ANKbH9H/sPm6xC6+W6ZuSCVDA1GIH23vctzdCfgJYSOWZMRml25dDOTze8oazRir8TTC5f9/NOONo38XsJvi0J5+4OiBzUhvpwwxNNyU1iS2d08xnm4MQGVrjqV2oEg41U3alKYm3LpXEttc4/4PWQA/Z8bR
- Openpgp: preference=signencrypt
Hello,
Since no one from the development team has replied to you yet (or maybe
by private mail only), let me reply to say how welcome such initiatives are.
You may or may not be aware of the existence of a project to design a
new standard library (without backward compatibility):
https://github.com/coq/stdlib2
Maxime Dénès is leading this project. It should be a perfect fit for
your new string library. Thus, the module vs type class question should
be answered in coordination with this effort. The current version of the
standard library uses modules for the most part but I believe the
current trend would rather be to use type classes (without abusing them,
which can lead to problems).
There has also been specific discussion about the future of the string
library: https://github.com/coq/coq/issues/8483
I suppose your perspective would be very welcome there as well.
I want to emphasize once more how welcome these initiatives are. I hope
(and I will try to ensure) that the stdlib2 project will be properly
managed to fully benefit from this kind of individual initiatives.
Thanks,
Théo
On 06/11/2018 11:15, David Holland wrote:
> As a few people may already be aware, I've been for some working on
> and off (mostly off, alas) on a new string library for Coq, with the
> long-term goal of getting it merged into the standard library.
>
> The main design goals are basically to address the shortcomings in the
> existing string library: it's not very complete, it doesn't extract
> well, the character type is a hassle to work with, and code that
> matches on string values rapidly becomes horrible.
>
> I have a fair amount of code (and theorems) but I'm not really ready
> to post any of it yet. At this stage what I'd like is input on the
> following points:
>
>
> 1. It is fairly clear that the string code should be abstracted over
> multiple possible character types, both to handle narrow vs. wide (and
> multibyte) characters and also, optionally, to allow using the type
> system to strongly distinguish strings and characters of different
> encodings.
>
> At this point, is this best done with modules or typeclasses? Both
> seem to be feasible, but they produce quite different results. What is
> (now) preferred for the standard library?
>
> In the case of typeclasses there'd be one class for characters and
> another for strings (and maybe some elaborations of each), and various
> instances, but most of the material would be stated once, at the top
> level, in terms of the string class. In the case of modules there'd be
> module types for characters and strings, and modules for the various
> kinds of characters and strings, but most of the material written in
> terms of the abstractions would need to go in a functor to be applied
> separately to each of the instances.
>
> Either way, most likely some but not all the new material will be
> available to be applied to the existing string type, but I've reached
> the conclusion that worrying about this should not be a priority.
>
>
> 2. What should go into the character types to hold their identity?
> The existing ascii type uses eight bools, which is horrible to work
> with and should not be repeated. However, there aren't a lot of good
> choices: it should be bounded (since all standard character sets are
> finite) which rules out using the ordinary integer types directly; and
> it needs to be a binary integer type (otherwise it is impossible to
> write down the bound on Unicode code points without blowing the stack)
> which rules out the Fin.t that comes with the vector library.
>
> Reasonable choices seem to be either (a) wrap N (or Z, but N seems
> more natural) with a proof of being in bounds, like Fin.t wraps nat;
> or (b) use or develop some machine integer library that is not
> currently in the standard library. What's preferable? There should,
> ultimately, be machine integers in the standard library too, but
> that's a whole additional project, and if it's going to be done at all
> to support characters, it should really be done properly. (I'm not
> entirely averse to this, but getting it done in a reasonable amount of
> time is likely to be problematic.)
>
>
> 3. How does the existing character scope work, and what's involved in
> generalizing it to work with multiple character types and/or multiple
> character encodings? It seems to be compiler magic, but that's as deep
> as I've gotten so far.
>
>
> 4. The support for recursive/list notations does not seem to be
> powerful enough to allow defining a matching syntax for strings
> similar to OCaml's. (That is, one where patterns in a string match are
> string constants and the match takes place by string comparison.)
> Since one of the problems with the current string library is the need
> to write
> match foo with
> | String "f" (String "o" (String "o" EmptyString)) => ...
> :
> end
> and improving this with notations doesn't seem to be feasible either,
> it would be nice to have _some_ scheme in hand that allows writing
> nicer-looking code, but I don't know what it should be.
>
> I'm inclined to think that because we want to be able to extract to
> OCaml strings, and OCaml strings aren't an inductive type (so doing
> matches of this kind produces awful code), the recommended method
> ought to be a series of if-comparisons; but because if is second-class
> relative to match (in some respects) this is not entirely satisfactory
> either.
>
> One might be able to arrange something like
> match string_match foo ["foo"; "bar"; "baz"] with
> | 0 => Some foo
> | 1 => Some bar
> | 2 => Some baz
> | S _ => None
> end
> or even just
> string_match foo [
> ("foo", Some foo);
> ("bar", Some bar);
> ("baz", Some baz);
> ] None
> but this isn't very nice without notation to support it and these
> approaches both have their own problems.
>
> Thoughts?
>
>
> 5. To avoid having weird things happen when extracting to OCaml
> strings, and also on general principles, I think it's best to continue
> to have a separate string type and not use lists of characters. It is
> easier and less error-prone to fold strings and lists of characters
> together when extracting to Haskell than to try to separate them on
> the fly when extracting to OCaml.
>
> So I think it's best to keep the structure of the current string type
> (an inductive type with its own nil and cons), and just avoid exposing
> it as much as possible to avoid extracting to awful OCaml code.
>
> For a while I was trying to find a way to carry over list theorems
> without having to reprove them, or find a way to prove one set of
> theorems about an abstract nil/cons type that could be shared between
> lists and strings, but eventually came to the conclusion that this
> isn't possible, or is expensive enough to be not worthwhile.
>
> Long-term feature request: it would be nice to have a way to rename
> the constructors of an inductive when instantiating a module;
> implicitly would be nice but explicitly would be good enough.
> Currently you cannot do this and cannot e.g. write a module signature
> that will match both list and (the existing) string. Similarly, it
> would be nice to be able to reason about constructors in typeclasses;
> currently you can write a typeclass that abstracts over list and (the
> existing) string, but you need a pile of explicit lemmas to carry the
> information encoded in the inductive structure of the type, and it's a
> fairly substantial nuisance both to set up and work with.
>
>
> 6. ... there are other things to bring up sometime but this email's
> long enough already.
>
>
Attachment:
signature.asc
Description: OpenPGP digital signature
- [Coq-Club] strings., David Holland, 11/06/2018
- Re: [Coq-Club] strings., Jason -Zhong Sheng- Hu, 11/06/2018
- Re: [Coq-Club] strings., Théo Zimmermann, 11/08/2018
- Re: [Coq-Club] strings., Daniel Schepler, 11/08/2018
Archive powered by MHonArc 2.6.18.