Saturday, May 24, 2014

I was inspired to write this article by a post on LiveJournal. A girl asked where she could find a Belarusian language package for the well-known program FineReader 8. Nearly four years have passed since then and several new versions of the program have been released, yet the problem with OCR for the Belarusian language has still not been solved.
It must be hard to find someone who actively uses computers and does not know about FineReader. The program is a very powerful tool for converting paper documents, images, and PDF (and DjVu) files to plain text. But that is not all it can do. The latest versions can process poor-quality documents and recognize texts with complex structure, e.g. texts containing tables, figures, and not very complicated formulas. Clearly, human intervention is still necessary, but it is minimal, and a more or less complex text can be corrected in a few minutes (assuming this is not the first time you have seen the program). To date, the program supports 189 languages, 36 of which are primary, i.e. have dictionary support.
Unfortunately, Belarusian is not one of the primary languages, and the quality of text recognition without dictionary support is very mediocre. Moreover, the company's specialists are not aware that Belarusian has three spellings in current use: the Latin spelling, the modern spelling (narkomovka), and the classical spelling (tarashkevitsa). So if, for example, you have a book in PDF/DjVu written in Belarusian Latin script, you will probably have difficulty converting it to plain text.
Several times I have appealed to the developers of the program with a proposal to add at least recognition of all three Belarusian spellings without dictionary support, but I was always refused, on the grounds that neither the Belarusian-speaking population nor the state has any commercial interest in FineReader, so adding it now is not advisable. I understand that I am not the first to make this proposal.
Through Google, I found an interesting article published in the newspaper "Zvezda", which discusses not only why FineReader lacks proper support for the Belarusian language, but also the state of the Belarusian language in the computer world in general.
Once I was firmly convinced that our state is not interested in supporting the national language, and the experts from ABBYY had made it clear that they would change something only in response to a large commercial order for their product, I started looking for a way out of the situation myself, because I honestly did not like the OCR quality that the program offers by default. The program often made recognition mistakes and required my involvement. One thing was clear: there was no getting by without dictionary support. So I decided to do it myself. Moreover, I set myself the goal of adding all three Belarusian spellings by hand.
First of all, I had to find dictionary files for all three Belarusian spellings. Naturally, the only criterion is the size of the dictionary file: the larger it is, the better. After a little searching, I found suitable dictionary files on my hard drive. I already had FineReader installed (I use FineReader 11), so all that remained was to build the dictionaries in the program itself. Incidentally, I could not get this to work in version 9: the program imported only part of the words from the dictionary. Versions 10 and 11 did not have this limitation.
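If you have several word lists for the same spelling, it may help to merge them into one larger file before importing. Below is a minimal Python sketch of that idea. It assumes the lists are plain-text UTF-8 files with one word per line; the function names and file layout are hypothetical, and the encoding FineReader expects on import should be checked in your version.

```python
from pathlib import Path


def merge_words(line_groups):
    """Merge several iterables of lines into one sorted list of unique words."""
    words = set()
    for lines in line_groups:
        for line in lines:
            word = line.strip()
            if word:  # skip empty lines
                words.add(word)
    # Sorting is by Unicode code point, which is enough for a stable file.
    return sorted(words)


def merge_word_files(paths, out_path, encoding="utf-8"):
    """Read one-word-per-line files, merge them, and write the result."""
    groups = [Path(p).read_text(encoding=encoding).splitlines() for p in paths]
    words = merge_words(groups)
    Path(out_path).write_text("\n".join(words) + "\n", encoding=encoding)
    return len(words)
```

For example, `merge_word_files(["list1.txt", "list2.txt"], "merged.txt")` would write the combined list and return the number of unique words (the file names here are placeholders).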
Finally, you can import the dictionary for Belarusian (Modern Spelling); all the dictionaries can be downloaded in one file. Choose User dictionary > Edit... > Import and wait... My computer is not very powerful, so importing the dictionary took about 5-6 minutes. It does not have to be imported every time, so it is worth the wait...
Everything else follows the same procedure we used to create dictionary support for the modern spelling of the Belarusian language. As a result, we are able to recognize text using a full dictionary in the classical spelling.
Not everyone knows that Belarusian can be written not only in Cyrillic letters but also in letters of the Latin alphabet (to be honest, not every Belarusian has even an elementary command of the modern spelling, let alone the Latin one). Yet this script was very popular in the 19th century. Incidentally, the first Belarusian newspaper, "Peasant's Truth", was printed in the Latin alphabet.
Perform all the steps that we performed when integrating the modern spelling dictionary, with the following changes. Language name: instead of "Copy of Belarusian", write "Belarusian (Latin)". Source language: Belarusian. Alphabet: here everything has to be replaced. I have specially prepared a Latin alphabet for this. You only need to copy and paste the Alphabet string
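If you would rather derive an alphabet string from your own Latin-script word list than copy a prepared one, here is a minimal Python sketch. It simply collects every letter that occurs in the words, in both lower and upper case. The function name is hypothetical, and whether FineReader's Alphabet field accepts the string in exactly this form is an assumption to verify in the program.

```python
def alphabet_from_words(words):
    """Collect every letter used in `words`, in both cases, as one string."""
    letters = set()
    for word in words:
        for ch in word:
            if ch.isalpha():  # ignore apostrophes, hyphens, digits
                # set.update() iterates over each string, so every
                # character of the lower/upper forms is added separately.
                letters.update(ch.lower(), ch.upper())
    # Sorted by Unicode code point: plain Latin first, then accented letters.
    return "".join(sorted(letters))
```

Called on a full word list, this returns one string containing each letter once, which can then be checked by eye for stray characters before pasting.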
