coq-club AT inria.fr

Subject: The Coq mailing list

List archive

Re: [Coq-Club] Unicode vs. homographic attacks vs. Coq trust

From: Clément Pit--Claudel <clement.pit AT gmail.com>
To: coq-club AT inria.fr
Subject: Re: [Coq-Club] Unicode vs. homographic attacks vs. Coq trust
Date: Mon, 7 Mar 2016 16:54:45 -0500
Authentication-results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=clement.pit AT gmail.com; spf=SoftFail smtp.mailfrom=clement.pit AT gmail.com; spf=None smtp.helo=postmaster AT mout.kundenserver.de
Ironport-phdr: 9a23:2dSUVBI8zDBu4Ep3WtmcpTZWNBhigK39O0sv0rFitYgULvvxwZ3uMQTl6Ol3ixeRBMOAu60C27Od6v25EUU7or+/81k6OKRWUBEEjchE1ycBO+WiTXPBEfjxciYhF95DXlI2t1uyMExSBdqsLwaK+i760zceF13FOBZvIaytQ8iJ35vxh7n5osCKKyxzxxODIppKZC2sqgvQssREyaBDEY0WjiXzn31TZu5NznlpL1/A1zz158O34YIxu38I46FppIZ8VvDxeL19RrhFBhwnNXo07Yvlr0rtVwyKs1AY02AblAZ/OwnZqVTRWp7svib+/r523CSfMMvqC6g1RRyt6q5qTFnjjyJRZG1xy33elsEl1PETmxmmvREqm4M=

On 03/07/2016 02:15 PM, Jonathan Leivent wrote:
> Another take on the issue of how best to get Unicode into proof
> scripts is: maybe sometimes it should not be allowed?
>
> Consider the problem of homographic attacks (as with:
> https://en.wikipedia.org/wiki/IDN_homograph_attack).
>
> Could this become an issue with trust in the case of Coq vs. the use
> of possibly homographic Unicode characters in Coq scripts?

I don't think so. Homograph attacks are intended to fool humans, for example
by convincing an unsuspecting victim to log into ‘exаmple.com’ instead of
‘example.com’.
Of course, this type of attacks is powerless against robots: to a computer,
‘exаmple.com’ and ‘example.com’ don't look the same at all. For example, my
computer currently complains about ‘exаmple’ being misspelled.

Tricking humans into trusting your Coq proofs isn't very useful. For example,
if you show me one of the following proofs, I might temporarily believe you;
but when the trick is discovered, your reputation will suffer; just like in
real mathematics when someone tries to pass a phony proof as true.

(* Homograph trick: *)
Definition Fаlse := True.
Goal Fаlse. red; auto. Qed.

(* Plain overriding of the constant ‘False’ *)
Definition False := True.
Goal False.
Proof. red; tauto. Qed.

(* Using a notation: *)
Notation "'False'" := (True).
Goal False.
Proof. tauto. Qed.

On the other hand, tricks that also fool computers, such as bugs in the Coq
Kernel (and in the independent proof-checker), are much more dangerous; they
prevent you from using Coq in an adversarial context. Fortunately, it's very
uncommon to work in an adversarial context where claims are backed using Coq
proofs, but checked by humans.

> (...) With Unicode, there are too many similarities and near-similarities,
> even
> among mathematical symbols. A similar issue occurs when there are distinct
> input methods (such as Company-Coq's auto-folding of "<=" vs. latex folding
> of
> "\leq") that leave the underlying representation different while producing
> the
> same appearance.

This sort of overloading is already very common in Coq, using notations (in
particular for arithmetic operators such as ‘+’, ‘-’, and ‘*’, as well as
numbers). In fact, Company-Coq's folding is not very different from adding a
small collection of notations to Coq; you can even get some of them (about a
third) by simply importing the Utf8 module in Coq.

Attachment: signature.asc
Description: OpenPGP digital signature

[Coq-Club] Unicode vs. homographic attacks vs. Coq trust, Jonathan Leivent, 03/07/2016
- Re: [Coq-Club] Unicode vs. homographic attacks vs. Coq trust, roux cody, 03/07/2016
- Re: [Coq-Club] Unicode vs. homographic attacks vs. Coq trust, Clément Pit--Claudel, 03/07/2016