Having an issue with character encoding in some places.

To reproduce:
File -> Properties -> Business -> Company address
Enter text with accents, e.g. "á"
Click OK.
Open again and it becomes "á"

Also shows up elsewhere, such as custom reports.
In Windows 10 with en_US as locale.
*** Bug 796804 has been marked as a duplicate of this bug. ***
The duplicate bug has a screenshot visually illustrating the problem.
The root cause of this bug may be the same as the one eventually found in bug 796728: guile seems to want strings encoded in the system's locale, whereas gtk uses utf8 by default.
*** Bug 797069 has been marked as a duplicate of this bug. ***
Experiments on Guile for Windows
--------------------------------
GnuCash v3.5
Uses Guile-2.0.14
Windows set to English (Australian) environment, no LANG= setting
(In case bugzilla munges unicode I'll repost with unicode-char rewritten "#")

I run a report, e.g. account-piecharts.scm, set Report Title to "Turkish Lira - ₺ Lira" and report-currency = TRY (symbol = ₺).

As we know, guile-2.0 munges unicode in string-port functions, therefore (format #f ": ~a" str) will munge unicode, as will (with-output-to-string ...) in html-string-sanitize.

Consequences (current state of maint):
--------------------------------------
Report shows title as "Turkish Lira - ? Lira Assets: ?1,000"
The tabbed window title shows "Turkish Lira - ₺ Lira"
saved-reports-2.8 (written by guile) shows "Turkish Lira \u20ba Lira"
book.gcm (written by C) shows "PageName=Turkish Lira ₺ Lira"
book.gcm (SchemeOptions) encodes title as "Turkish Lira \u20ba Lira"

Bugfix Attempt 1
----------------
We could fix this report via (string-append ": " str) and (open-output-string), but I think this is the wrong approach because we would need to hunt down *every* string-port function and modify it. It side-steps the issue. Otherwise this approach is harmless.

Mark Weaver Monkey-patch Bugfix attempt 2
-----------------------------------------
See http://lists.gnu.org/archive/html/guile-user/2019-04/msg00025.html whereby string-port functions are redefined to handle strings as UTF-8 instead of the locale encoding.

This has interesting consequences:

1. The existing saved-reports-2.8 and book.gcm, having encoded unicode as \uNNNN, are properly read back as extended chars.

2. The reports are fixed.
   Title is "Turkish Lira ₺ Lira (Balance ₺1,000)"
   The tabbed window title is still fine: "Turkish Lira ₺ Lira"
   Saving *again* into saved-reports-2.8 writes "Turkish Lira ₺ Lira" (utf8)
   Saving *again* into book.gcm writes (C part) PageName=Turkish Lira ₺ Lira but (scheme part) SchemeOptions= ... "Turkish Lira ₺ Lira"

3. Relaunching GnuCash and reloading these UTF8 strings leads to:
   Title reads "Turkish Lira ₺ Lira" (whereby ₺ is now #xe2 #x201a #xba); otherwise the report is still working well
   Loading from saved-reports-2.8 is working perfectly well.

Source code review
------------------
gnc-plugin-page-report.c writes PageName using g_value_set_string (gnc-plugin-page.c:604)
gnc-plugin-page-report.c writes SchemeOptions using g_key_file_set_value (gnc-plugin-page-report.c:862)
saved-report-2.8 is written in gnc-report.c (various)

PageName is read back at (unsure where)
SchemeOptions is read back with g_key_file_get_value (gnc-plugin-page-report.c:923)
saved-report-2.8 is read back with gfec_try_load (gnucash-bin.c:364)

Conclusion
----------
I don't think applying the monkey-patch is safe, due to the problems reading SchemeOptions, nor is hunting down all string-port functions to rewrite them the right approach (it's a band-aid). I think upgrading to guile-2.2 will fix all these issues.

Monkey-patch:
-------------
Paste the following somewhere general, e.g. utilities.scm:

    (when (string=? (effective-version) "2.0")
      ;; When using Guile 2.0.x, use monkey patching to change the
      ;; behavior of string ports to use UTF-8 as the internal encoding.
      ;; Note that this is the default behavior in Guile 2.2 or later.
      (let* ((mod                     (resolve-module '(guile)))
             (orig-open-input-string  (module-ref mod 'open-input-string))
             (orig-open-output-string (module-ref mod 'open-output-string))
             (orig-object->string     (module-ref mod 'object->string))
             (orig-simple-format      (module-ref mod 'simple-format)))

        (define (open-input-string str)
          (with-fluids ((%default-port-encoding "UTF-8"))
            (orig-open-input-string str)))

        (define (open-output-string)
          (with-fluids ((%default-port-encoding "UTF-8"))
            (orig-open-output-string)))

        (define (object->string . args)
          (with-fluids ((%default-port-encoding "UTF-8"))
            (apply orig-object->string args)))

        (define (simple-format . args)
          (with-fluids ((%default-port-encoding "UTF-8"))
            (apply orig-simple-format args)))

        (define (call-with-input-string str proc)
          (proc (open-input-string str)))

        (define (call-with-output-string proc)
          (let ((port (open-output-string)))
            (proc port)
            (get-output-string port)))

        (module-set! mod 'open-input-string       open-input-string)
        (module-set! mod 'open-output-string      open-output-string)
        (module-set! mod 'object->string          object->string)
        (module-set! mod 'simple-format           simple-format)
        (module-set! mod 'call-with-input-string  call-with-input-string)
        (module-set! mod 'call-with-output-string call-with-output-string)

        (when (eqv? (module-ref mod 'format) orig-simple-format)
          (module-set! mod 'format simple-format))))
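For comparison, the same fluid binding that the monkey-patch relies on can be applied to a single call instead of being installed globally. This is only a minimal sketch: with-utf8-ports is a hypothetical helper name (it is not part of GnuCash or Guile), and it assumes, as the monkey-patch does, that Guile 2.0 string ports pick up %default-port-encoding when they are created.

```scheme
;; Hypothetical helper (not part of GnuCash or Guile): call THUNK with
;; UTF-8 as the default encoding for any ports created while it runs.
(define (with-utf8-ports thunk)
  (with-fluids ((%default-port-encoding "UTF-8"))
    (thunk)))

;; The report title from the experiment above; \u20ba is the lira sign.
(define title "Turkish Lira - \u20ba Lira")

;; (format #f ...) writes through a string port internally, so it is
;; the kind of call that is affected on Guile 2.0.
(define formatted
  (with-utf8-ports
   (lambda ()
     (format #f ": ~a" title))))

;; If the lira sign survived the round trip through the string port,
;; FORMATTED equals the plain concatenation and this displays #t.
(display (string=? formatted (string-append ": " title)))
(newline)
```

Scoping the binding this way leaves the rest of the program untouched, but it has the same drawback as Bugfix Attempt 1: every affected call site has to be found and wrapped.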
Addendum to the Mark Weaver monkey-patch section above, in case it's not clear:

4. Therefore the SchemeOptions part in book.gcm, although it has been written properly as UTF-8, is not read back correctly by C, and is completely munged again (and that is not guile's fault!)
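The code points reported in item 3 (#xe2 #x201a #xba) are what one would expect if the UTF-8 bytes of ₺ were decoded as Windows-1252. The sketch below reproduces that under stated assumptions: misread-utf8 and code-points are hypothetical helper names, the (ice-9 iconv) module must be available in the Guile build, and the platform's iconv must know the encoding name "CP1252". It illustrates the symptom only and does not claim to be the code path GnuCash actually takes.

```scheme
(use-modules (rnrs bytevectors)   ; string->utf8
             (ice-9 iconv))       ; bytevector->string

;; Hypothetical helper: encode STR as UTF-8, then decode those bytes
;; as if they had been written in WRONG-ENCODING.
(define (misread-utf8 str wrong-encoding)
  (bytevector->string (string->utf8 str) wrong-encoding))

;; Hypothetical helper: list the code points of STR as hex strings.
(define (code-points str)
  (map (lambda (ch) (number->string (char->integer ch) 16))
       (string->list str)))

;; U+20BA (lira sign) is the three bytes E2 82 BA in UTF-8. Decoded as
;; CP1252, byte 82 maps to U+201A, so this should display (e2 201a ba).
(display (code-points (misread-utf8 "\u20ba" "CP1252")))
(newline)
```

One character turning into three is the signature of UTF-8 text being read back through a single-byte encoding, which is consistent with the SchemeOptions value being munged on the reading side rather than by guile.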
Aaron, please test a recent nightly from https://code.gnucash.org/builds/win32/maint - I think this bug is considered fixed for the next release.
I tested the build for 3.5 from 5/21 and the problem seems to be fixed. Thank you.