Skip to Content.
Sympa Menu

coq-club - Re: [Coq-Club] [Emacs] correct offsets in .glob files when unicode characters are present

coq-club AT inria.fr

Subject: The Coq mailing list

List archive

Re: [Coq-Club] [Emacs] correct offsets in .glob files when unicode characters are present


Chronological Thread 
  • From: Stefan Monnier <monnier AT iro.umontreal.ca>
  • To: Abhishek Anand <abhishek.anand.iitg AT gmail.com>
  • Cc: coq-club <coq-club AT inria.fr>
  • Subject: Re: [Coq-Club] [Emacs] correct offsets in .glob files when unicode characters are present
  • Date: Thu, 23 Mar 2023 17:23:21 -0400
  • Authentication-results: mail2-smtp-roc.national.inria.fr; spf=None smtp.pra=monnier AT iro.umontreal.ca; spf=Pass smtp.mailfrom=monnier AT iro.umontreal.ca; spf=None smtp.helo=postmaster AT mailscanner.iro.umontreal.ca
  • Ironport-data: A9a23:az5fxqn6VqAw+/w5D+QENwjo5gwrIkRdPkR7XQ2eYbSJt1+Wr1Gzt xJJDG6PafmKMTbyftsibYu//UICv8XUyNNhHldq/HsxQVtH+JHPbTi7BhepbnnKdqUvb2o+s p5AMoGYRCwQZiWBzvt4GuG59RGQ7YnRGvymTrSs1hlZHWdMUD0mhQ9oh9k3i4tphcnRKw6Ws LsemeWGULOe82MyYzJ8B56r8ks156yp4mhA5TTSWNgS1LPgvylNZH4gDfrpR5fIatE8NvK3Q e/F0Ia48gvxl/v6Io7Nfh7TKyXmc5aKVeS8oiI+t5uK3nCukhcPPpMTb5LwX6v4ZwKhxLidw P0V3XC5pJxA0qfkwIzxWDEAe81y0DEvFBYq7hFTvOTKp3AqfUcAzN1/B14qF5UZ0NxpJiZk1 NgkIR5KdxW60rfeLLKTEoGAh+wmJcjvJ4gWvHdt13fYF/FgXJXERbnQ6NZcmjw57ixMNa+FP YxDMWYpNkmGPkUn1lQ/UPrSmM+tgX/5cjBCgFOPouwq5m/V0BZ82byrO9O9ltmiFJUFxxjD+ j6uE2LROBYQbtOe5gu5qHeCiP+XvXP6VN0RC+jtnhJtqAfKnD1MVEF+uUGAifK+kwu1X89VA 1cF/zIn66k07k2iCNfnNyBUu1aBtx8YQNFZFes38keM0K2S/gOeAHQeQzdFLtcv3CMredA0/ k6CgM/DHT1FiqOyDli80oudg3SrPBFAeAfuehQ4ZQcC5tDipqQ6gRTOUstvHcaJYjvdQmiYL 9ei93NWulkDsSIY//nmowif2lpAsrCMEVVovVmNNo6wxlohDLNJcbBE/nDwxJ6sxq6wSFuGo RDocOC+tcRm4X1gWEWwrAglGbio4+2OOTvajEQpHoMmsSmo/Hi/Z41Z5Hd1LS+F0/romxe3O yc/WisIufe/2UdGi4ctPupd7OxxlcDd+SzNDKy8Uza3SsEZmPW71C9vf1WM+GvmjVIhl6oyU b/CL5n9UCdEU/88lWDrLwv47VPN7n1mrY80bc2npylLLZLFPxZ5tJ9caQvQM75hhE96iFWNo 4432zS2J+V3CrSkM3aLreb/3HgQIGM2DI2+p4NWcOCEOA9vBGAmDeSZ2rggdop5nqpUj/rF5 TmxRwdB1Vf1iHaPaR+LcRhehEDHAP5CQYYAFXV0Zz6AgiF7Ca72tfd3X8ZpIdEPqbIypdYqF KNtU5vbXZxypsHvpm51gW/V9tIyLXxGRGumYkKYXdTIV8I5F1GUooe7LmMCNkAmV0KKiCf3m JX4viuzfHbJb10K4B/+Oav/nWCi92MQgvxzVEbuK9xeMhekuotzJiC7yrd9L8gQIF+RjnGXx iSHMyc+/OPtmo4S9MWWpKamq4zyLfByMHAHFEbm7JG3FxLgwEyd/aF6Xt2lQxXhRULv2aD7Z exq3/D2a/IGu1BRsrtDKbVgzINgxt62p7Zl3hZVRlvWZVGVG5dlJmiMzJQWvJIc27N5mw+HZ V2Sy4N3OJGiGsDsIHgOLiULM8WB0vA1nGHJzPIXeU/V2g5+zIClY25zYSaerTN7LaRnFa8Y2 sIjhZIm0BO+gR8ULdq2tCBY2GCSJHgmUa98lJUlLKL0qwgskHduXIf9D3Lo3ZSxdNl8CEkmD TuKjq7khb4H5E7jcWI2JEfdz9hmmpUClxBb/mAsf23TtIL+ucY2+xlN/RAcbAdflExH2t0uH FlbDRR+IKHW8gp4gMRGYXuXJDhAIx+nq2jR0FoCkVPLQ3a4DlLtKHIPAsfT3UQ73V8FQB1l0 uC29GLXXwzuXvnN5QopeEs8q/XcXd17rQLDv8a8HvW6JZoxYBu7o6qqdGAZ9kLtPpIsiG3lo dhV4/tBM4D6EQUyoKQLLZaQ+ppNaRKDJU1EGepA+oFQF07iWTiC4xq8AGHvRdFoONrx7l6eN 8xiAultRiaO/n+CgR5DDJFdPoIuuuAi4eQzX4/CJEkEguO5lSVoupeBzRrOrjYnbPs2mPlsN 76LUSyJF1GRon5mm2Xtis1gEUjgaPkmYDzM5syEwN8rJbki7t41KVoT14GqtUq7KAFkphKYn D3Sbp/sktBN995eoJvOII5iWSOEccj+RcaZwjCV6t5uV+7CAe3KlgESq2TkAThoAKssa4x3u ImJ4fHK3xLjnbcpUmrmtYGLOItX6O6TAudGEMLFA0NLvCmFWcbcwQUn/V6kI8YUyNhY2Mmtb Fa6WfCRbv8QYc9Wn1dOWhhdEjEcKqX5VbjhriWDtMaxCgAR/AjEDdG//1r7RDh/Wg5RHLOmE S7ymfKlxu4AnbR2HBVeWs1XWc5pEmHsSY4NVoPXtwDBKkKKn1nbmL/ptSR42AHxEnPeTfrLu 8PUdCPfKia3lrrDluxCkopIuRYSMnZxrM8wcm8Z+P90kzqKN3EHH8tMLaQ5DoxorQKq2KHaf D3taE4QORf5VxlAcjT+543tYF7OTKhGcNL0PScg8E6oej+7TtHISqdo8iB7pWx6YH3/xeWgM ssT4WD0IgP3+JxyWOIP/bautI+LHB8BKq4goigRUvAeAiryxZ0P3X1lBwFEUyrKCYfMjkKNO G0yQ3xeTUi/D0X4eSqll7i5BzlB1A4DDR1xBctM/Dobk4SBy6tdzfr5J/v+27lFZ81iyHsmW ybsX2XUi4yJ8iV7hEbq0u7FRYd1E/XNA829KrP5SAQW2aq5goji0wXuggJXJPwfFMVj/58xW 9ViD7XSxKhIFayJ5ICr9A==
  • Ironport-hdrordr: A9a23:BQeCy63A1gZpyFv+ZY+ysgqjBJYkLtp133Aq2lEZdPWaSK2lfq GV7Y0mPHrP5Qr5N0tQ/exoVJPufZqYz+8X3WBzB8bFYOCFghrLEGgK1+KLqFeNJ8S9zJ856U 4KSchD4bPLY2SSwfyKhzWQIpIb+fGitICOqsG29QYUcShaL5tYxyM8LyqlPmgefng8OaYE
  • Ironport-phdr: A9a23:Y/mnMRV1r4rbpDRms7jd6kNlXnHV8KwRXzF92vMcY1JmTK2v8tzYM VDF4r011RmVB9idsa4YwLOK7ejJYi8p39WoiDg6aptCVhsI2409vjcLJ4q7M3D9N+PgdCcgH c5PBxdP9nC/NlVJSo6lPwWB6nK94iQPFRrhKAF7Ovr6GpLIj8Swyuu+54Dfbx9HiTajYb5+N gu6oAreu8UZnIdvJac8wQbVr3VVfOhb2WxnKVWPkhjm+8y+5oRj8yNeu/Ig885PT6D3dLkmQ LJbETorLXk76NXkuhffQwSP4GAcUngNnRpTHwfF9hD6UYzvvSb8q+FwxTOVPczyTbAzRDSi8 6JmQwLmhSsbKzI09nzch8pth6xZvR2hvQRyzY7KboyLKfRxfKDTfd0US2VPUcleSzdMAp+gY 4YVE+YNIfxVo5X7qlATrRW+Hw6sBOb3xzFVmn/5w6M63P4nEQrb2wIgGNQOsHLVrNrrLqcZT Oe4zKzPzTXFafNW2Cny6JLTfxAgu/GMWqh8cdfUyUkoEQPFkk+QppL/MDORzOgCr3KU4vNmW OmyhGEptxt/rSKzxscwlIbJnIQVx0ja+Sh93Is7K8G0RUp7bNCrEpZdsyCXOpdoT84hQWxlu Dg3xLMHtJC0fCYHypspyh3bZvGDbYSF/BHuWemNLDl3gn9uZbGxhw6q/ES9xODwSNO43EhWo idKiNXAqHAA2wbT58SbUvdx4Fqt1DeV2wzO6uxJLlo4mKTUJpI73LI9mZweulnZECDsgkX5l qqWe10k+ue27+TnZa3rppqGOI91jgH+Kbghlda5AeQ+LAcORXKX+eWm1L3k50H5XbBKjvowk qnFqpDVO94UpqqjDwBJyIkv8xe/DzG439QEhXQLMVxIdAydg4T3J13DL+r0Aem/jli2kDpn2 +jKPrj7DZXMKnjDnq3hfbF460NEzQozytZf55RVCr4fOvL+QU7xtMbfDhMjLgy0xuHnCNNh1 oMER22AGKuZPLnOvlCS/u4vJfKDa5cPuDnhM/gl++LujXghlFMAZaWpx4cYaGikHvR6JEWUe WbjgtAYEWsTogU+SPHqh0aZXD5IZ3eyWro86SshBIKnC4fDXIGtj6ab0Ce1BJ0FLlxBX3mGE Xbzd4iHE94KYSSeaptomDwFTrisSMko0xiouEn7yqZoBuXR8ywc85nk0Y4myffUkEQJ6TF6B sLV9maLSWB5hCtcQjgw2qFyu2R80FDFzK1/heBCGNVXofhABFRpfaXAxvB3XoihEjnKec2EH Q7OqrSOBDgwSoh02NoSewNnHN7kiBnf3i2sCrtTlrqRBZVy/LiPl2PpKZNbzHDLnLIkk0FgW tFGYGihgKh++hL7BpTO1VickKC2b6kV2GjG/TTL1nKA6XlRSxU4SqDZRTYab0rSo87+4xbHS LmoCLk9GgpbzoifL6xMdsfkhFEAT/6wcM/GbTeXnGG9TQ2N2qvKbIfufDAF2z7BDUEfjw0J1 XOPNAwlDS6npW/ESj10EhTyZkTq7fNzoXf9RUZcIxiiSUpn2vL1/xcUgabZUPYPxvcevyxnr TxoHVG71taQCtyapgMncr8OKdU6qExK027UrWkfdtSpMrxii1gCcg92o1Km1hN5DZ9FmNQrq 3VixRR7KKaR2ldMPz2C2pW4NrrSI2j0tBehDsyeklTZ1tCX970n6e4/7Ujmuwe1DEcr9zNs2 pgd0neR4InLEBtHSYj4ASNVv1Bxo7DXZDV45puBjCY2d/Ps7nmYhpR0XLhAqF7oZdpUPaKaG RWnFsQbA5LrM+k2gx2zaRlCOulO9akyNsfgdv2c2aftMvwz+VDuxWlB/o151VqBsiRmTeudl ZMCxfeZ0xGvVi37ykqktcbrg41NYXcZFyDsrEqsTJ4UfaB0cYsRXC2rJMuxx9hkr5/3XDhF8 VmlG0kL0cvvchPYPDmflUVAkE8Qp3Kggy6xyTd5xioooqSo1yvL2+3+dRADNwanXUFahEz3a cixhtEeBg2zahQx0QGi7gD8zrRao6J2Ky/SR11Jdm74NTMqXqy1v7uEK8lBjfFg+SBQVuK9b EqyS6T65QYf1CX/BWZXwHYwfnmmt474kBpzlG+GZC8o9jyDIZ02n0qZuYSUTOU0vHJOXCRij DjLGlWwd8Kk+9mZjdaLs+yzUX6gSowGdCDqyY2asy7orWZuABC5g7Wygoi+SlJ8i3enkYM3E 3iQ/3OeKsHx2q+3MPxqZBxtDV74sY9hH51m15E3j9cW0GQbgZOc+TwGl33yOJNVw/GbDjJFS DgVztrS+AWg1ldkKyfDwoX/UHSQ2ONgfd78f2YR3D4n4slOTqydpu8h/2M9sh+joATdbOIo1 DIaz/0v5WQygvsO/hcoySOBGL0bGQ9TNGa/8nbAp8D7p6JRamG1dLG230crhtGtAoaJpQREU Wr4cJMvTmdgq99yO1XW3Djv+5npLZPOOMkLuETewHKix6BFbYg8nf0QiW97NHLh6Dc7nvUjg 0Um1MO/sZKMNn9guqOjBVhFMzrze9ke8zWrhq87/I7e1tKqF5RlQmxNXYHvC++tFzQOr/nuM 0CFGXU1+HKSGL6ZHBWYrlpjqHTTCZ2iMzecLRx7hZ1rFhyUJUhEiwkSWjgg2J8jEUa3wcvnb F107zRX7Vewqx1Hzv9kOkviV2DZuB2lYzNyQ5HXOhNS6ksqC179F8uY46ozGihZ+sfktwmRM imBYA8OC2gVW0uCDlSlP7+05NCG/fLKTuy5Z+DDZ7mDs4k8H7+B2I6v3416/j2NKtTHP39sC Oc+01ZCWnYxEtrQmjEGQSga3yzXaMvTqBC58yxx5sexlZajEBro/peKAqBOPM9H/hm3hbyIP uiWiTw/KCxfkI4JwnnU0rUW2BgZgmAmdjWgF6gBqT+YTK/UnfwybVZTYCdyOc1UqqMki1AWY 4iB0o+zjOM+06ZmbjUNHUbskcyoe8EQdmS0NVedQV2OKKzDPzrAhcf+faK7T7RUyuRSrRy5/ ziBQCqBdnyOkSfkUxe3PKRCliaeaVZbv4e7cxt3IWn5S5T7bxq9LMV6hDlwyrR+1RaofSYMd CNxdU9AtOja9SRDnvB2AHBM9FJgJOiAgC2Q6e/VMNAXq/wtHyFzkf5A7X0+jbBcpnIhJrQ9i G7ZqdhgpEujm++Ex298URZAnT1MgZqCoURoPaixHnZoWGzDug8I6mOMERkDo51uA4+200i14 tnVk+TuLTBE78jZ9M9aDMGGcKpv0VIkOBvtAzXdCg0IV3iqL2aZmklalu2I+3SR6JMz+MCEp Q==
  • Ironport-sdr: 641cc350_KtDvw8AUggDA+UBcGmMTej+3M1tdN3PMfMJiqsqFvCkTJnc UbCA4vDfAvt8yvaw7qLqccoyaRnO6ddn5thZyzw==

> Is there a way to get coqc to produce the "correct" offsets in .glob
> files:where it doesnt count unicode characters as multiple chars?
> Alternatively, is there some variant in goto-char in Emacs that can account
> for the double counting of unicode characters?

You need to convert between "bytes in the file" and "chars in Emacs
buffer". The conversion between the two depends on the
coding-system used and can be non-trivial.

When you use `byte-to-position` you convert between "bytes in Emacs
buffer" and "chars in Emacs buffer" which is not quite the same.
In many cases this will happen to work right, tho: Emacs's internal
representation is a variant of UTF-8, so if your file's coding-system is
also UTF-8 the two will usually coincide but if your file uses something
different then you're out of luck.

If you want to convert between a file's byte-positions and its buffer's
char positions you want `filepos-to-bufferpos` and `bufferpos-to-filepos`.

Note also that all those functions counts bytes (and chars) from the
beginning of the buffer rather than from the beginning of the line, so
you will probably want to do a gymnastics like

(defun my-goto-line-byte (bytes)
(let* ((bolc (line-beginning-position))
(bolb (bufferpos-to-filepos bolc))
(targetb (+ bolb bytes))
(targetc (filepos-to-bufferpos targetb)))
(goto-char targetc)))

Another option is to do something like:

(defun my-goto-line-byte (bytes)
(goto-char (line-beginning-position))
(let* ((linec (buffer-substring (point) (line-end-position)))
(lineb (encode-coding-string line buffer-file-coding-system))
(prefixb (substring lineb 0 bytes))
(prefixc (decode-coding-string lineb buffer-file-coding-system)))
(forward-char (length prefixc))))

This last option is fundamentally the only "reliable" option, and the
`filepos<->bufferpos` functions may en up using such a thing internally
(and if so, encoding/decoding the whole freaking buffer, thus making
this operation O(buffer-size) :-( ).


Stefan




Archive powered by MHonArc 2.6.19+.

Top of Page