photools.com Community

IMatch Discussion Boards => General Discussion and Questions => Topic started by: anmue on November 10, 2016, 11:15:19 PM

Title: Spell Checker - wrong encoding of dictionary?
Post by: anmue on November 10, 2016, 11:15:19 PM
Hi,

I just
* downloaded the German OpenOffice dictionary dict-de_de-frami_2015-12-28.oxt,
* extracted the dictionary files de_DE_frami.aff and de_DE_frami.dic to the folder where IMatch expects them, and
* renamed them to de_DE.aff and de_DE.dic.

Then I ran a simple test and found that misspelled words with German umlauts (e.g. Öterreich) show some wrong characters in the suggestion list. It looks to me like an encoding error. See attached screenshot.

Or did I do something else wrong?

Regards Andreas
Title: Re: Spell Checker - wrong encoding of dictionary?
Post by: Mario on November 11, 2016, 12:34:35 AM
It looks like the file you downloaded is not encoded with the correct character set.
Probably your browser did not save it correctly.

Where did you download from?
I use a German spell-checker dictionary from OpenOffice with IMatch myself, and I have never seen this kind of problem.
Title: Re: Spell Checker - wrong encoding of dictionary?
Post by: anmue on November 11, 2016, 12:28:20 PM
Hi Mario,

I used this link for the download: http://extensions.openoffice.org/en/download/18425 (which you can find here: http://extensions.openoffice.org/en/project/german-de-de-frami-dictionaries).

Usually I have no problems with my Firefox downloads.

I checked the encoding of the two files with Notepad++, and it reports them as ANSI.

I then converted them to UTF-8. When I use the converted files, I get a different (still wrong) suggestion list. See 2nd screenshot.
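
One thing that may be worth checking at this point: Hunspell-style .aff files normally state their own encoding on a SET line, and a file that was re-saved as UTF-8 while its SET line still names a single-byte encoding would disagree with itself. The following is only a rough sketch of such a check; the helper names and the sample bytes are made up for illustration and are not taken from the actual dictionary.

```python
import re
from typing import Optional


def declared_encoding(aff_bytes: bytes) -> Optional[str]:
    """Return the encoding named on the SET line of an .aff file, if present."""
    match = re.search(rb"^SET[ \t]+(\S+)", aff_bytes, flags=re.MULTILINE)
    return match.group(1).decode("ascii") if match else None


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample: a tiny affix file saved in a single-byte encoding.
sample = "SET ISO8859-1\nTRY esianrtäüöß\n".encode("latin-1")

print(declared_encoding(sample))  # encoding the file claims to use
print(is_valid_utf8(sample))      # whether the bytes are really UTF-8
```

For a real file, the bytes could be read with `open("de_DE.aff", "rb").read()` and passed to both helpers; if the declared encoding and the actual bytes do not match, that mismatch is a likely suspect.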

Regards Andreas
Title: Re: Spell Checker - wrong encoding of dictionary?
Post by: Mario on November 11, 2016, 04:17:25 PM
Forcing another character set will not repair the mangled data.
It just converts the rubbish you have into different rubbish.
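
A small illustration of that effect, as a sketch only: it assumes that "ANSI" means Windows-1252 (the usual Western European code page on Windows), and the variable names are made up for the example.

```python
word = "Österreich"

# UTF-8 bytes misread as Windows-1252: the umlaut turns into two wrong characters.
utf8_bytes = word.encode("utf-8")
mangled = utf8_bytes.decode("cp1252")
print(mangled)

# Saving the mangled text as UTF-8 afterwards keeps the damage.
resaved = mangled.encode("utf-8").decode("utf-8")
print(resaved == word)

# Only undoing the exact wrong step brings the original word back.
recovered = mangled.encode("cp1252").decode("utf-8")
print(recovered == word)
```

Converting the file in an editor corresponds to the second step: the text is stored in a new encoding, but the characters that were already wrong stay wrong.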

How did you download the file?
Save as...?
It seems that the way you downloaded the file saved it as ANSI, breaking the UTF-8 encoding.
Try downloading it again with "Save as"...
Title: Re: Spell Checker - wrong encoding of dictionary?
Post by: anmue on November 11, 2016, 09:27:20 PM
Hi Mario,

I tried different browsers, different sources, and different download methods, but I could not get a UTF-8 encoded dictionary from the OpenOffice sources.

But I finally found a source that provides a UTF-8 encoded dictionary: https://github.com/titoBouzout/Dictionaries

It's not the latest version, but it works. :D

Thanks for your support.

Andreas