Spell Checker - wrong encoding of dictionary?

Started by anmue, November 10, 2016, 11:15:19 PM


anmue

Hi,

I just
* downloaded the German OpenOffice dictionary dict-de_de-frami_2015-12-28.oxt,
* extracted the dictionary files de_DE_frami.aff and de_DE_frami.dic to the folder where IMatch expects them, and
* renamed them to de_DE.aff and de_DE.dic.

Then I ran a simple test and found that for misspelled words with German umlauts (e.g. Öterreich), the hint list shows some wrong characters. It looks like an encoding problem to me. See attached screenshot.

Or what else did I do wrong?

Regards Andreas

Mario

This looks like the file you downloaded is not encoded with the correct character set.
Probably your browser did not save it correctly.

Where did you download from?
I use a German spell checker from Open Office with IMatch myself, and I have never seen this kind of problem.
-- Mario
IMatch Developer
Forum Administrator
http://www.photools.com  -  Contact & Support - Follow me on 𝕏 - Like photools.com on Facebook

anmue

Hi Mario,

I used this link for the download: http://extensions.openoffice.org/en/download/18425 (which you can find here: http://extensions.openoffice.org/en/project/german-de-de-frami-dictionaries).

Usually I have no problems with my Firefox downloads.

I checked the encoding of the two files with Notepad++, and it reports ANSI.

I then converted them to UTF-8. When I use the converted files, I get a different (but still wrong) hit list. See 2nd screenshot.
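
One thing worth checking after such a conversion: Hunspell-format .aff files normally name their character encoding on a SET line, and that line has to match the actual bytes of both files. The sketch below reads the declared encoding and re-encodes both files as UTF-8 while updating the SET line. The function names and the ISO8859-1 fallback are assumptions made for illustration. The thread does not say what the frami files declare, nor how IMatch treats the SET line.

```python
import re
from typing import Optional, Tuple


def declared_encoding(aff_bytes: bytes) -> Optional[str]:
    """Return the encoding named on the .aff file's SET line, if any."""
    # The SET line itself is plain ASCII, so it can be found in the raw bytes.
    match = re.search(rb"^SET[ \t]+(\S+)", aff_bytes, flags=re.MULTILINE)
    return match.group(1).decode("ascii") if match else None


def to_utf8(aff_bytes: bytes, dic_bytes: bytes) -> Tuple[bytes, bytes]:
    """Re-encode both dictionary files as UTF-8 and update the SET line."""
    # Assumption: fall back to ISO8859-1 when no SET line is present.
    source = declared_encoding(aff_bytes) or "ISO8859-1"
    # Raises LookupError if Python does not know the declared encoding name.
    aff_text = aff_bytes.decode(source)
    dic_text = dic_bytes.decode(source)
    # Only the first SET line is rewritten; everything else stays untouched.
    aff_text = re.sub(r"^SET[ \t]+\S+", "SET UTF-8", aff_text,
                      count=1, flags=re.MULTILINE)
    return aff_text.encode("utf-8"), dic_text.encode("utf-8")


if __name__ == "__main__":
    # Hypothetical two-line samples standing in for de_DE.aff and de_DE.dic.
    aff = "SET ISO8859-1\nTRY esianrtäüöß\n".encode("latin-1")
    dic = "2\nÖsterreich\nMünchen\n".encode("latin-1")
    new_aff, new_dic = to_utf8(aff, dic)
    print(declared_encoding(aff), "->", declared_encoding(new_aff))
```

To apply this to real files, read them in binary mode (open(path, "rb")), write the results back in binary mode, and keep a backup of the originals.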

Regards Andreas

Mario

Forcing another character set will not set the mangled data right.
It just converts the rubbish you have into different rubbish.
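
A small illustration of what mismatched encodings do to an umlaut, assuming "ANSI" here means Windows-1252 (cp1252 in Python), as it usually does on Western European Windows. The helper name misread is made up for this example, and the snippet only shows the general effect, not a diagnosis of the files in this thread.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one character set and decode the bytes with another."""
    # errors="replace" substitutes U+FFFD for bytes that are invalid
    # in the target encoding instead of raising an exception.
    return text.encode(written_as).decode(read_as, errors="replace")


word = "Österreich"

# UTF-8 bytes read as Windows-1252: every byte maps to some character,
# so the damage is ugly but still reversible.
mangled = misread(word, "utf-8", "cp1252")
print(mangled)                               # Ã–sterreich
print(misread(mangled, "cp1252", "utf-8"))   # Österreich

# Windows-1252 bytes read as UTF-8: the byte for "Ö" is not valid UTF-8
# on its own, so it is replaced and the original letter is lost.
lost = misread(word, "cp1252", "utf-8")
print(lost == "\ufffdsterreich")             # True
```

Once a file has been saved after a lossy step like the second one, no later conversion can bring the original characters back.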

How did you download the file?
Save as...?
It seems that the way you downloaded the file saved it as ANSI, breaking the UTF-8 encoding.
Try to download it again, with "Save as"...

anmue

Hi Mario,

I tried different browsers, different sources, and different methods of downloading, but I couldn't get a UTF-8 encoded dictionary from any OpenOffice source.

But I finally found a source that provides a UTF-8 encoded dictionary: https://github.com/titoBouzout/Dictionaries

It's not the latest version, but it works.  :D

Thanks for your support.

Andreas