OCR Server documentation
Chapter 1. Installation
1.1. Preliminary requirements
The program is generally compatible with Linux systems on the x86 architecture. It should run without problems on all modern GNU/Linux distributions.
Check that the tcpserver and tcpclient utilities are installed in one of the locations listed in the PATH environment variable. These utilities come from the ucspi-tcp package.
Check that the curl utility is installed in one of the locations listed in the PATH environment variable. Curl is needed for the activation process.
The xdelta3 utility is required to perform an upgrade. Check that it is installed in one of the locations listed in the PATH environment variable. This tool is provided by the xdelta (version 3) or xdelta3 package.
Please verify the requirements above before reporting problems with the application.
1.2. Program installation
To install OCR Server, simply decompress the "ocr_server-(version number).tar.bz2" archive into your destination location.
To install an "upgrade" version of the program, you need the last full installation archive released before the version being installed. Execute the following command in the destination location:
XDELTA="-R -c -s ocr_server-(full version number).tar.bz2" \
tar --use-compress-program=xdelta3 -xvf ocr_server-(upgrade version number).tar.xdelta3
No further steps are required.
1.3. Activation
OCR Server will not function without activation.
You must be logged in as root to activate the application. Run the activate.sh script and enter your serial number when asked. Activation is carried out automatically over the Internet, so an Internet connection is required.
Chapter 2. Running
2.1. Running the server
To run the server, execute the ocr_server.sh script. By default, OCR Server listens on port 10000. To use a different port, pass it as a command line argument, as shown below:
./ocr_server.sh 10001
It is often necessary to run the server in the background so that it stays active after the user logs off. This can be achieved with the command shown below. Any messages printed to the console will be saved to the file "ocr_server.log".
nohup ./ocr_server.sh > ocr_server.log 2>&1 &
2.2. Running the client
To run the client, execute the ocr_client.sh script. Two command line arguments are required. The first argument is the IP address of the host running OCR Server, optionally followed by a colon and a port number. If the port number is omitted, the client connects to the default port 10000. The second argument is the path to the input file. See the examples below:
./ocr_client.sh 127.0.0.1 ~/images/test.jpg
./ocr_client.sh 192.168.0.2:10001 fax_for_ocr.pdf
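An application that accepts the same HOST[:PORT] argument as the client could split it as in the minimal sketch below. The function name parse_host_port and the constant OCR_DEFAULT_PORT are hypothetical and are not part of OCR Server. The sketch only illustrates the default-port rule described above.

```c
#include <stdlib.h>
#include <string.h>

#define OCR_DEFAULT_PORT 10000 /* default port stated in this documentation */

/* Splits "HOST[:PORT]" into a host string and a port number.
 * Returns 0 on success, -1 if the host is empty or too long
 * or if the port is not a number in the range 1..65535.
 * Hypothetical helper, not part of OCR Server. */
int parse_host_port(const char *arg, char *host, size_t host_size, int *port)
{
    const char *colon = strchr(arg, ':');
    size_t host_len = colon ? (size_t)(colon - arg) : strlen(arg);

    if (host_len == 0 || host_len >= host_size)
        return -1;
    memcpy(host, arg, host_len);
    host[host_len] = '\0';

    if (colon == NULL) {
        /* No port given: fall back to the default. */
        *port = OCR_DEFAULT_PORT;
        return 0;
    }

    char *end = NULL;
    long value = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || value < 1 || value > 65535)
        return -1;
    *port = (int)value;
    return 0;
}
```

With this sketch, "127.0.0.1" yields port 10000 and "192.168.0.2:10001" yields port 10001.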
If you run ocr_client.sh with incorrect command arguments, the following help message will be displayed:
Usage: ocr_client.sh HOST[:PORT] [OPTION ...] FILE
HOST OCR Server host, eg. 127.0.0.1
PORT OCR Server port, default is 10000
-e FORMAT export format: text | rtf | pdf | html | bmp | jpeg | png | tiff_jpeg | tiff_zip
-m LANG_CODE messages language, eg. de or fr
-r LANG_CODE recognition language, eg. de or fr
-w WIDTH image width (for image format export)
-h HEIGHT image height (for image format export)
Chapter 3. Supported languages
3.1. Languages supported during text recognition
Table 3-1. Languages supported during text recognition and their respective codes
| Code | Language name |
|---|---|
| ab | Abkhaz |
| af | Afrikaans |
| sq | Albanian |
| hy | ArmenianEastern,ArmenianGrabar,ArmenianWestern |
| ay | Aymara |
| az | AzeriCyrillic,AzeriLatin |
| ba | Bashkir |
| eu | Basque |
| be | Belarusian |
| br | Breton |
| bg | Bulgarian |
| ca | Catalan |
| ch | Chamorro |
| ce | Chechen |
| zh_CN | ChineseSimplified |
| zh_TW | ChineseTraditional |
| cv | Chuvash |
| co | Corsican |
| hr | Croatian |
| cs | Czech |
| da | Danish |
| nl_NL | Dutch |
| nl_BE | DutchBelgian |
| en | English |
| eo | Esperanto |
| et | Estonian |
| fo | Faeroese |
| fj | Fijian |
| fi | Finnish |
| fr | French |
| fy | Frisian |
| gd | GaelicScottish |
| gl | Galician |
| lg | Ganda |
| de | German |
| de_LU | GermanLuxembourg |
| el | Greek |
| gn | Guarani |
| ha | Hausa |
| he | Hebrew |
| hu | Hungarian |
| is | Icelandic |
| io | Ido |
| id | Indonesian |
| ia | Interlingua |
| ga | Irish |
| it | Italian |
| ja | Japanese |
| kk | Kazakh |
| ki | Kikuyu |
| ky | Kirgiz |
| kg | Kongo |
| ko | Korean |
| ku | Kurdish |
| la | Latin |
| lv | Latvian |
| lt | Lithuanian |
| lu | Luba |
| mk | Macedonian |
| mg | Malagasy |
| ms | Malay |
| mt | Maltese |
| mi | Maori |
| mo | Moldavian |
| mn | Mongol |
| nb | NorwegianBokmal |
| nn | NorwegianNynorsk |
| ny | Nyanja |
| ie | Occidental |
| oj | Ojibway |
| os | Ossetic |
| pl | Polish |
| pt_BR | PortugueseBrazilian |
| pt_PT | PortugueseStandard |
| qu | Quechua |
| rm | RhaetoRomanic |
| ro | Romanian |
| rn | Rundi |
| ru | Russian |
| sm | Samoan |
| sr | SerbianCyrillic |
| sn | Shona |
| sk | Slovak |
| sl | Slovenian |
| so | Somali |
| st | Sotho |
| es | Spanish |
| su | Sunda |
| sw | Swahili |
| sv | Swedish |
| tl | Tagalog |
| ty | Tahitian |
| tg | Tajik |
| tt | Tatar |
| to | Tongan |
| tn | Tswana |
| tr | Turkish |
| tk | Turkmen |
| ug | UighurCyrillic,UighurLatin |
| uk | Ukrainian |
| uz | UzbekCyrillic,UzbekLatin |
| cy | Welsh |
| wo | Wolof |
| xh | Xhosa |
| zu | Zulu |
| no | Norwegian |
The OCR Server engine can also recognize languages that are not listed in the table. They do not, however, have corresponding codes. For support for other languages, please contact the manufacturer or distributor.
3.2. Available languages for server messages
Table 3-2. Available languages for server messages and their respective codes
| Code | Language name |
|---|---|
| en | English |
| ru | Russian |
| de | German |
| fr | French |
| es | Spanish |
| it | Italian |
| nl | DutchStandard |
| sv | Swedish |
| pt | Portuguese |
Chapter 4. Protocol
4.1. General rules
OCR Server and the client communicate using the TCP/IP protocol. In most cases it will be more practical to use this protocol directly in your application instead of running the client utility.
With few exceptions, information is exchanged between the client and the server in text form. Commands and server responses are simple, self-descriptive and human readable. Each message is one line of text and ends with the end-of-line character 0x0A ('\n').
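Because every text message ends with 0x0A, a client can buffer the received bytes and cut off one message at a time. The following is a minimal sketch of that idea. The function name extract_message is hypothetical, and the sketch assumes the caller has already read the bytes from the socket into buf.

```c
#include <stddef.h>
#include <string.h>

/* Copies the first complete message in buf into line, without the
 * trailing '\n', and terminates it with '\0'.
 * Returns the number of bytes consumed from buf (message plus
 * terminator), 0 if buf does not yet contain a complete message,
 * or -1 if the message does not fit into line.
 * Hypothetical helper, not part of OCR Server. */
long extract_message(const char *buf, size_t buf_len,
                     char *line, size_t line_size)
{
    const char *eol = memchr(buf, '\n', buf_len);
    if (eol == NULL)
        return 0; /* wait for more data */

    size_t msg_len = (size_t)(eol - buf);
    if (msg_len >= line_size)
        return -1;

    memcpy(line, buf, msg_len);
    line[msg_len] = '\0';
    return (long)(msg_len + 1);
}
```

The caller would drop the consumed bytes from its buffer and call the function again until it returns 0. Binary file data must not be passed through this function, because it is not line-oriented.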
At the beginning of the dialogue, and after each command finishes executing, the server sends the message "Send a command now." to indicate that it is ready for the next command. This message is sent regardless of whether the command executed successfully, as in the example below
< Send a command now.
> Set recognition language to fr
< Done.
< Send a command now.
or with an error as in the example below.
< Send a command now.
> Set recognition language to xx
< Error: One or more arguments are invalid
< Send a command now.
Immediately after a client connects, the server sends two messages. The first is a greeting; the second is the invitation described above.
< SILVERCODERS OCR Server
< Send a command now.
Typically, after a command executes successfully, the server immediately sends the result to the client, as in the example below:
< Send a command now.
> Get serial number
< FFFF-1111-2222-3333-4444
< Send a command now.
or the "Done." message if the executed command does not return a result, as in the example below:
< Send a command now.
> Set image format to jpeg
< Done.
< Send a command now.
4.2. Get server version
When this command is sent, OCR Server will return its version.
< Send a command now.
> Get server version
< 1.0.3
< Send a command now.
4.3. End session
When this command is sent, OCR Server will end the current session and close the connection.
< Send a command now.
> End session
< Bye.
4.4. Set protocol version to
When this command is sent, OCR Server will switch to the specified protocol version if it is supported.
Currently only version 1 of the protocol is supported.
< Send a command now.
> Set protocol version to 1
< Done.
< Send a command now.
4.5. Set messages language to
When this command is sent, OCR Server will change the current messages language to the specified one if it is supported.
The messages language is the language used for error messages and recognition tips.
< Send a command now.
> Set messages language to fr
< Done.
< Send a command now.
4.6. Set recognition language to
When this command is sent, OCR Server will change the current recognition language to the specified one if it is supported.
The recognition language is the language used to recognize documents. Setting the recognition language that matches the documents being processed is very important, because OCR Server uses word dictionaries to improve recognition quality.
< Send a command now.
> Set recognition language to fr
< Done.
< Send a command now.
4.7. Upload document
When this command is sent, OCR Server will receive the source document file, check its format and process it to obtain images of all document pages.
< Send a command now.
> Upload document
< Send file now.
> Extension: tif
> Size: 25276245
> Checksum: 38421
> --- FILE DATA ---
< Page: 0
> Continue
< Page: 1
> Continue
< Page: 2
> Continue
< Page: 3
> Continue
< Done.
< Send a command now.
4.8. Get number of pages
When this command is sent, OCR Server will return the number of pages that were retrieved from the uploaded document file.
< Send a command now.
> Get number of pages
< 3
< Send a command now.
4.9. Set image width to
When this command is sent, OCR Server will change the width of the image returned by the "Get image of page" command.
< Send a command now.
> Set image width to 300
< Done.
< Send a command now.
4.10. Set image height to
When this command is sent, OCR Server will change the height of the image returned by the "Get image of page" command.
< Send a command now.
> Set image height to 200
< Done.
< Send a command now.
4.11. Set image format to
When this command is sent, OCR Server will change the format of the image returned by the "Get image of page" command.
< Send a command now.
> Set image format to jpeg
< Done.
< Send a command now.
The following image formats are supported:
4.11.1. bmp
The BMP file format, sometimes called bitmap, is an image file format used to store bitmap digital images, especially on Microsoft Windows and OS/2 operating systems.
4.11.2. jpeg
The JPEG File Interchange Format (JFIF) is an image file format for exchanging JPEG encoded files compliant with the JPEG Interchange Format (JIF) standard.
4.11.3. png
Portable Network Graphics (PNG) is an image format that employs lossless data compression. PNG was created to improve upon and replace the GIF format, as an image-file format not requiring a patent license.
4.11.4. tiff_jpeg
Tagged Image File Format (abbreviated TIFF) is a file format for storing images, including photographs and line art. This format uses JPEG compression.
4.11.5. tiff_zip
Tagged Image File Format (abbreviated TIFF) is a file format for storing images, including photographs and line art. This format uses ZIP compression.
4.12. Enable page breaks in plain text
When this command is sent, OCR Server will enable page breaks in the plain text returned by the "Export document as plain text" command. The form feed character (code 0x0C) is used to separate pages.
< Send a command now.
> Enable page breaks in plain text
< Done.
< Send a command now.
This command requires server version 1.0.2 or later.
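Once page breaks are enabled, a client can recover page boundaries from the exported text by looking for the form feed character. The sketch below counts pages this way. The function name count_text_pages is hypothetical, and the sketch assumes that form feeds appear only between pages (no trailing form feed after the last page), which this documentation does not state explicitly.

```c
#include <stddef.h>

/* Counts the pages in exported plain text by counting form feed
 * characters (0x0C). Assumes form feeds separate pages, so the
 * page count is the number of form feeds plus one.
 * Hypothetical helper, not part of OCR Server. */
size_t count_text_pages(const char *text, size_t length)
{
    size_t pages = 1;
    for (size_t i = 0; i < length; i++) {
        if (text[i] == '\f')
            pages++;
    }
    return pages;
}
```

The length is passed explicitly because the text arrives as file data with a known size and is not guaranteed to be NUL-terminated.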
4.13. Get image of page
When this command is sent, OCR Server will send the image of the specified page.
< Send a command now.
> Get image of page 0
< Done.
< Extension: jpg
< Size: 144628
< Checksum: 33381
< --- FILE DATA ---
< Send a command now.
By default, the width and height of the returned image are taken from the source document, but you can change them using the "Set image width to" and "Set image height to" commands.
The default format of the returned image is PNG, but you can change it using the "Set image format to" command.
4.14. Recognize page
When this command is sent, OCR Server will run the recognition process on the specified page.
< Send a command now.
> Recognize page 0
< Progress: 0
> Continue
< Progress: 0
> Continue
< Progress: 0
...
< Progress: 1
> Continue
< Progress: 19
> Continue
< Rect: 2376 134 2526 222
> Continue
< Tip: Increase resolution to improve recognition accuracy of small text.
> Continue
< Progress: 19
> Continue
< Rect: 332 372 480 418
...
< Progress: 100
> Continue
< Done.
< Send a command now.
4.15. Export document as
When this command is sent, OCR Server will send the recognition results in the specified format.
< Send a command now.
> Export document as plain text
< Progress: 0
> Continue
< Progress: 0
> Continue
< Progress: 25
> Continue
< Progress: 50
> Continue
< Progress: 75
> Continue
< Progress: 100
> Continue
< Done.
< Extension: txt
< Size: 8533
< Checksum: 46941
< --- FILE DATA ---
< Send a command now.
The following export formats are supported:
4.15.1. plain text
Plain text format is an ideal solution for indexing and searching purposes. It can be edited using text editors.
4.15.2. rtf
Rich text format is an ideal solution for later editing. It preserves document layout and embedded images. It can be edited using most modern word processors.
4.15.3. pdf
Portable document format is an ideal solution for printing. Editing is currently limited to some specialized, commercial software.
4.15.4. html
HTML format is an ideal solution for publishing recognized text on web pages. It can be edited using text editors or webmastering tools.
4.16. Getting details of current license
Every license limits the number of measure units (pages or characters) that can be processed during a period of time (hour, day, week, month or year). In addition, some licenses have a non-renewable limit.
Licenses can be temporary (expiring on a specific date or after a specific number of days) or permanent.
4.16.1. Get serial number
When this command is sent, OCR Server will return the serial number of the current license.
< Send a command now.
> Get serial number
< FFFF-1111-2222-3333-4444
< Send a command now.
4.16.2. Get counter measure unit
When this command is sent, OCR Server will return the measure unit of the current license's limitation. It can be "pages" or "characters".
< Send a command now.
> Get counter measure unit
< pages
< Send a command now.
4.16.3. Get limitation period
When this command is sent, OCR Server will return the time period of the current license's limitation. It can be "hour", "day", "week", "month", "year" or "infinite". If it is "infinite", the license has a non-renewable limit.
< Send a command now.
> Get limitation period
< month
< Send a command now.
4.16.4. Get units per period
When this command is sent, OCR Server will return the maximum number of units that can be used in one period of time according to the limitation of the current license.
< Send a command now.
> Get units per period
< 10000
< Send a command now.
4.16.5. Get remaining units
When this command is sent, OCR Server will return the number of units that can still be used in the current period of time according to the limitation of the current license.
< Send a command now.
> Get remaining units
< 9997
< Send a command now.
4.16.6. Get expiration date
When this command is sent, OCR Server will return the expiration date of the current license. The returned value is a date in ISO format (YYYY-MM-DD), or "none" if the license is permanent.
< Send a command now.
> Get expiration date
< none
< Send a command now.
4.17. Transmitting files
Some commands, such as "Upload document" or "Export document as", need a file to be transmitted to or from OCR Server. The protocol makes it possible to transmit a file and verify its integrity.
4.17.1. File header
Each transmitted file begins with a header. Each header consists of three lines. Each line begins with a prefix. The prefix is separated from the value by a colon and a space.
You can find a sample file header below:
Extension: jpg
Size: 10244
Checksum: 14514
4.17.2. File data
File data follows the file header directly. This is just a simple stream of bytes and it is not encoded.
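A client that uploads a document has to produce this header before sending the raw bytes. The following minimal sketch formats the three header lines into a buffer. The function name format_file_header is hypothetical, and the sketch assumes that each header line ends with '\n' like other protocol messages, with the file bytes starting right after the last '\n'.

```c
#include <stdio.h>

/* Writes the three-line file header into out.
 * Returns the header length in bytes, or -1 if it does not fit.
 * Assumes every header line is terminated by '\n'.
 * Hypothetical helper, not part of OCR Server. */
int format_file_header(char *out, size_t out_size, const char *extension,
                       unsigned long size, unsigned int checksum)
{
    int written = snprintf(out, out_size,
                           "Extension: %s\nSize: %lu\nChecksum: %u\n",
                           extension, size, checksum);
    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return written;
}
```

For the sample header above, the call would be format_file_header(buf, sizeof buf, "jpg", 10244, 14514). After sending the header, the client would send exactly the number of bytes given in the Size line, unmodified.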
4.17.3. Calculating checksum
The file checksum is calculated using an extended XOR algorithm. It is always 16 bits long. The first 8 bits (the high byte) are all file bytes XORed together. The second 8 bits (the low byte) are all negated file bytes XORed together. Negation is required to detect the loss of bytes with value 0.
A sample checksum function is shown below (C language):
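int calculate_checksum(const char* data, int size)
{
    char checksum_1 = 0; /* XOR of all file bytes (high byte of the result) */
    char checksum_2 = 0; /* XOR of all negated file bytes (low byte of the result) */
    for (int i = 0; i < size; i++)
    {
        checksum_1 ^= data[i];
        checksum_2 ^= (~data[i]);
    }
    /* Combine both 8-bit values into one 16-bit checksum. */
    return ((unsigned char)checksum_1 << 8) + (unsigned char)checksum_2;
}
4.18. Reporting and controlling progress of operation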
Some commands, such as "Recognize page" or "Export document as", start a process that can be quite time-consuming. So that user-friendly client applications can be implemented, the protocol makes it possible to report the progress of such operations and to cancel them on demand.
4.18.1. Reporting progress in percentage
OCR Server reports the progress of time-consuming operations as a percentage. The percentage value is an integer in the range 0..100. Each value can be reported more than once.
< Progress: 0
> Continue
< Progress: 0
> Continue
< Progress: 25
> Continue
< Progress: 50
> Continue
This feature is often used to implement a progress bar in the client application.
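A progress bar needs the numeric value from the "Progress:" message. The sketch below extracts and validates it. The function name parse_progress is hypothetical, and the sketch assumes the message has already been stripped of its trailing '\n'.

```c
#include <stdlib.h>
#include <string.h>

/* Parses a "Progress: N" message.
 * Returns the percentage (0..100), or -1 if the line is not a
 * well-formed progress message.
 * Hypothetical helper, not part of OCR Server. */
int parse_progress(const char *line)
{
    static const char prefix[] = "Progress: ";
    size_t prefix_len = sizeof prefix - 1;

    if (strncmp(line, prefix, prefix_len) != 0)
        return -1;

    const char *digits = line + prefix_len;
    char *end = NULL;
    long value = strtol(digits, &end, 10);

    /* Reject missing digits, trailing garbage and out-of-range values. */
    if (end == digits || *end != '\0' || value < 0 || value > 100)
        return -1;
    return (int)value;
}
```

Since the same value can be reported more than once, a client may want to redraw its progress bar only when the returned value changes.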
4.18.2. Reporting current page
During time-consuming multipage operations, OCR Server reports the number of the current page. The page number is an integer greater than or equal to 0. Each value is reported only once.
< Page: 0
> Continue
< Page: 1
> Continue
< Page: 2
> Continue
This feature is often used to show operation progress in the client application.
4.18.3. Reporting current image region
During time-consuming multiregion operations, OCR Server reports the coordinates of the current region. The coordinates consist of four integer values (left, top, right, bottom) separated by spaces. The unit is pixels.
< Progress: 19
> Continue
< Rect: 2376 134 2526 222
> Continue
This feature is often used to mark region being processed on document preview in the client application.
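To draw the region on a preview, the client has to turn the "Rect:" message into four numbers. A minimal sketch follows. The type ocr_rect and the function parse_rect are hypothetical names, and the sketch assumes the message has already been stripped of its trailing '\n'.

```c
#include <stdio.h>

/* Region coordinates in pixels, in the order sent by the server. */
typedef struct {
    int left;
    int top;
    int right;
    int bottom;
} ocr_rect;

/* Parses a "Rect: LEFT TOP RIGHT BOTTOM" message into *rect.
 * Returns 0 on success, -1 if the line is not a well-formed
 * region message; *rect is left untouched on failure.
 * Hypothetical helper, not part of OCR Server. */
int parse_rect(const char *line, ocr_rect *rect)
{
    ocr_rect parsed;
    char trailing;

    /* The extra %c detects unexpected characters after the fourth value. */
    int fields = sscanf(line, "Rect: %d %d %d %d%c",
                        &parsed.left, &parsed.top,
                        &parsed.right, &parsed.bottom, &trailing);
    if (fields != 4)
        return -1;

    *rect = parsed;
    return 0;
}
```

For the message shown above, the parsed region is 150 pixels wide (2526 - 2376) and 88 pixels high (222 - 134).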
4.18.4. Recognition tips
OCR Server reports important tips that can improve recognition quality. Messages of this kind should be presented to the end user in the client application.
< Progress: 19
> Continue
< Tip: Increase resolution to improve recognition accuracy of small text.
> Continue
< Progress: 19
> Continue
4.18.5. Continue or cancel the process
After sending progress information or a recognition tip, OCR Server waits for the client application to send the "Continue" command. This command confirms that the user interface was updated, the end user did not cancel the operation, and the process can continue.
Instead of a confirmation, the "Cancel" command can be sent to OCR Server. This command asks the server to abort the current operation.
< Progress: 0
> Continue
< Progress: 1
> Cancel
< Error: The operation was canceled.
< Send a command now.
This feature is often used to implement "Cancel" button in client application.
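The handshake can be summarized in two small decisions: does the received message require an answer, and which answer should be sent. The sketch below shows one way to express this. The function names are hypothetical, and the list of prefixes that require an answer ("Progress:", "Page:", "Rect:", "Tip:") is inferred from the example dialogues in this chapter.

```c
#include <string.h>

/* Returns 1 if line starts with prefix, 0 otherwise. */
static int has_prefix(const char *line, const char *prefix)
{
    return strncmp(line, prefix, strlen(prefix)) == 0;
}

/* Returns 1 if the server waits for "Continue" or "Cancel" after
 * this message, 0 otherwise (for example after "Done." or an error).
 * The prefix list is inferred from the example dialogues.
 * Hypothetical helper, not part of OCR Server. */
int needs_reply(const char *line)
{
    return has_prefix(line, "Progress: ")
        || has_prefix(line, "Page: ")
        || has_prefix(line, "Rect: ")
        || has_prefix(line, "Tip: ");
}

/* Returns the reply to send, including the end-of-line character. */
const char *choose_reply(int user_cancelled)
{
    return user_cancelled ? "Cancel\n" : "Continue\n";
}
```

A client loop would call needs_reply for every received message and, when it returns 1, send choose_reply with the current state of its "Cancel" button.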
4.18.6. Reporting that the process is complete
When the process finishes successfully, OCR Server sends the "Done." message.
< Progress: 99
> Continue
< Progress: 100
> Continue
< Done.
< Send a command now.
4.19. Reporting errors
OCR Server can report two types of errors. Standard errors relate only to the operation currently being performed; after sending the relevant information, the server is ready to execute further commands. After a fatal error, the server terminates the connection and exits.
OCR Server indicates standard errors by sending a message that begins with "Error:". Fatal errors are signaled by a message that begins with "Fatal error:".
< Send a command now.
> Set recognition language to xx
< Error: One or more arguments are invalid
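A client can tell the two error types apart by the message prefix alone. The following minimal sketch does that. The enum ocr_error_kind and the function classify_error are hypothetical names.

```c
#include <string.h>

typedef enum {
    OCR_NO_ERROR,       /* the message is not an error report */
    OCR_STANDARD_ERROR, /* "Error:" - the session can continue */
    OCR_FATAL_ERROR     /* "Fatal error:" - the server closes the connection */
} ocr_error_kind;

/* Classifies one server message by its prefix.
 * Hypothetical helper, not part of OCR Server. */
ocr_error_kind classify_error(const char *line)
{
    static const char fatal_prefix[] = "Fatal error:";
    static const char error_prefix[] = "Error:";

    if (strncmp(line, fatal_prefix, sizeof fatal_prefix - 1) == 0)
        return OCR_FATAL_ERROR;
    if (strncmp(line, error_prefix, sizeof error_prefix - 1) == 0)
        return OCR_STANDARD_ERROR;
    return OCR_NO_ERROR;
}
```

After OCR_STANDARD_ERROR the client can wait for the next "Send a command now." message; after OCR_FATAL_ERROR it should expect the connection to be closed.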