Writing text with umlauts to a JSON file - reads back wrong

Started by ubacher, October 18, 2017, 02:52:38 PM


ubacher

I used the Files sample app to write a text file and then read it back. I modified the app to write
some German umlauts at the end: "message" : "Hello from IMatch äöüß",

When I then read this JSON file back, the umlauts are not shown correctly. What's wrong?

I attach the demo.json file for others to try (remove the .txt from the name before using it).

My settings: Windows 10 (English), Region: Austria, Language: English (UK)

Mario

This is what I see in my editor. All German umlauts are perfectly OK.
The umlauts are also OK in Windows Notepad.



You need to use an editor that handles UTF-8 encoding.
Which editor did you use?

-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

ubacher

I think it might be a problem with the read, not the writing of the JSON file.
Output from the Files app:

Mario

Please file a bug report so I can look into this for a later release.

Mario

No need to file a report. This is already fixed for the next release.
The read file endpoint did not treat files as UTF-8 when they did not include a BOM. But the BOM is optional: the Unicode Standard permits the BOM in UTF-8 but does not require or recommend its use (https://en.wikipedia.org/wiki/Byte_order_mark, http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf).
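
The effect of such a mismatch can be reproduced outside IMatch. The following is a minimal Python sketch, not IMatch code. It assumes that the fallback encoding is Windows code page 1252, which is common on Western European systems but is only an assumption here.

```python
import codecs

text = "Hello from IMatch äöüß"

# UTF-8 bytes as an application might write them, without a BOM.
raw = text.encode("utf-8")
has_bom = raw.startswith(codecs.BOM_UTF8)  # False: encode("utf-8") adds no BOM

# Decoding the same bytes with the wrong code page garbles every umlaut
# (cp1252 is an assumed stand-in for the Windows ANSI code page).
wrong = raw.decode("cp1252")   # "Hello from IMatch Ã¤Ã¶Ã¼ÃŸ"

# Decoding as UTF-8 restores the original text.
right = raw.decode("utf-8")
```

The file on disk is correct in both cases. Only the encoding chosen by the reader differs, which matches the observation that editors display the file properly.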

jeg

I may have a similar problem. I use Ghostscript to extract the text content of a PDF file. In this file I can see the umlauts, like this:
Für 1 R
but when I use IMatch.readTextFile I get the following
F�r 1 R
as the result. Will this problem be solved in the next release?

Mario

You read a PDF file with readText in IMatch? This cannot work.

Or do you produce a text file with Ghostscript? If so, make sure it is in standard Windows Unicode or, if you need to produce a UTF-8 encoded file, make sure it has a BOM header.
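
If the extracted text can be post-processed before IMatch reads it, a BOM can be added when saving. This is a rough Python sketch with a hypothetical file name and helper function. It does not involve Ghostscript or IMatch and only shows what "UTF-8 with a BOM" means at the byte level.

```python
import codecs
import os
import tempfile

def write_utf8_with_bom(path: str, text: str) -> None:
    """Hypothetical helper: save text as UTF-8 with a leading BOM."""
    # The "utf-8-sig" codec writes the three-byte UTF-8 BOM (EF BB BF) first.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)

# Hypothetical file name, used only for illustration.
path = os.path.join(tempfile.mkdtemp(), "extracted.txt")
write_utf8_with_bom(path, "Für 1 R")

with open(path, "rb") as f:
    data = f.read()

starts_with_bom = data.startswith(codecs.BOM_UTF8)
```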

readTextFile currently interprets files without a UTF-8 or Unicode BOM at the beginning as standard Windows ANSI-encoded. The next version will interpret files without a BOM as UTF-8.
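
The difference between the two behaviors can be sketched as BOM detection with a configurable fallback. This is illustrative Python, not the actual readTextFile implementation. The function name decode_text is made up, cp1252 is an assumed stand-in for the Windows ANSI code page, and UTF-32 BOMs are not handled.

```python
import codecs

def decode_text(data: bytes, fallback: str) -> str:
    """Decode bytes by BOM if one is present, else with the fallback encoding."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec consumes the BOM and takes the byte order from it.
        return data.decode("utf-16")
    return data.decode(fallback)

sample = "Für 1 R".encode("utf-8")         # UTF-8 bytes without a BOM

current = decode_text(sample, "cp1252")    # BOM-less data treated as ANSI
planned = decode_text(sample, "utf-8")     # BOM-less data treated as UTF-8
with_bom = decode_text(codecs.BOM_UTF8 + sample, "cp1252")
```

With a BOM present the fallback is never used, so a file saved with a BOM decodes the same way under either behavior.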