OCR Server documentation

Chapter 1. Installation

1.1. Preliminary requirements

The program is generally compatible with Linux systems on the x86 architecture. It should run without problems on all modern GNU/Linux distributions.

Check that the tcpserver and tcpclient utilities are installed in one of the locations listed in the PATH environment variable. These utilities come from the ucspi-tcp package.

Check that the curl utility is installed in one of the locations listed in the PATH environment variable. Curl is needed for the activation process.

The xdelta3 utility is required to perform an upgrade. Check that it is installed in one of the locations listed in the PATH environment variable. This tool is provided by the xdelta package (version 3) or by the xdelta3 package.

Please take the requirements above into account when reporting problems with the application.

1.2. Program installation

To install OCR Server, simply extract the "ocr_server-(version number).tar.bz2" archive to your destination location.

Installing the program from an "upgrade" package requires the most recent full installation archive that precedes the version being installed. Run the following command in the destination location:

XDELTA="-R -c -s ocr_server-(full version number).tar.bz2" \
	tar --use-compress-program=xdelta3 -xvf ocr_server-(upgrade version number).tar.xdelta3

No further steps are required.

1.3. Activation

OCR Server will not function without activation.

You must be logged in as root to activate the application. Run the activate.sh script and enter your serial number when asked. Activation is carried out automatically over the Internet, so an Internet connection is required.

Chapter 2. Running

2.1. Running the server

To run the server, execute the ocr_server.sh script. OCR Server listens on port 10000 by default. To use a different port, pass it as a command line argument, as shown below:

./ocr_server.sh 10001

It is often necessary to run the server in the background so that it stays active after the user logs off. The command below does this. Any messages printed to the console are saved to the file "ocr_server.log".

nohup ./ocr_server.sh > ocr_server.log 2>&1 &

2.2. Running the client

To run the client, execute the ocr_client.sh script. Two command line arguments are required. The first argument is the IP address of the host running OCR Server, optionally followed by a colon and a port number. If the port number is omitted, the client connects to the default port 10000. The second argument is the path to the input file. See the examples below:

./ocr_client.sh 127.0.0.1 ~/images/test.jpg
./ocr_client.sh 192.168.0.2:10001 fax_for_ocr.pdf

If you run ocr_client.sh with incorrect command line arguments, the following help message is displayed:

Usage: ocr_client.sh HOST[:PORT] [OPTION ...] FILE
HOST            OCR Server host, eg. 127.0.0.1
PORT            OCR Server port, default is 10000
-e FORMAT       export format: text | rtf | pdf | html | bmp | jpeg | png | tiff_jpeg | tiff_zip
-m LANG_CODE    messages language, eg. de or fr
-r LANG_CODE    recognition language, eg. de or fr
-w WIDTH        image width (for image format export)
-h HEIGHT       image height (for image format export)
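
The HOST[:PORT] argument shown in the help message can be split in a client program roughly as sketched below. This is only an illustration, not the code of ocr_client.sh: the function name split_host_port and the constant OCR_DEFAULT_PORT are hypothetical, and only the default port 10000 comes from the text above. The sketch assumes an IPv4 address or host name without colons.

```c
#include <stdlib.h>
#include <string.h>

#define OCR_DEFAULT_PORT 10000 /* default port named in this chapter */

/* Splits "HOST[:PORT]" into host and port (hypothetical helper).
   Returns 0 on success and -1 if the argument is malformed. */
int split_host_port(const char *arg, char *host, size_t host_size, int *port)
{
    const char *colon = strchr(arg, ':');
    size_t host_len = colon ? (size_t)(colon - arg) : strlen(arg);

    if (host_len == 0 || host_len >= host_size)
        return -1;
    memcpy(host, arg, host_len);
    host[host_len] = '\0';

    *port = OCR_DEFAULT_PORT;
    if (colon) {
        char *end;
        long value = strtol(colon + 1, &end, 10);
        /* Reject an empty port, trailing garbage and out-of-range values. */
        if (end == colon + 1 || *end != '\0' || value < 1 || value > 65535)
            return -1;
        *port = (int)value;
    }
    return 0;
}
```

With the two examples above, "127.0.0.1" yields port 10000 and "192.168.0.2:10001" yields port 10001.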

Chapter 3. Supported languages

3.1. Languages supported during text recognition

Table 3-1. Languages supported during text recognition and their respective codes

Code    Language name
ab      Abkhaz
af      Afrikaans
sq      Albanian
hy      ArmenianEastern, ArmenianGrabar, ArmenianWestern
ay      Aymara
az      AzeriCyrillic, AzeriLatin
ba      Bashkir
eu      Basque
be      Belarusian
br      Breton
bg      Bulgarian
ca      Catalan
ch      Chamorro
ce      Chechen
zh_CN   ChineseSimplified
zh_TW   ChineseTraditional
cv      Chuvash
co      Corsican
hr      Croatian
cs      Czech
da      Danish
nl_NL   Dutch
nl_BE   DutchBelgian
en      English
eo      Esperanto
et      Estonian
fo      Faeroese
fj      Fijian
fi      Finnish
fr      French
fy      Frisian
gd      GaelicScottish
gl      Galician
lg      Ganda
de      German
de_LU   GermanLuxembourg
el      Greek
gn      Guarani
ha      Hausa
he      Hebrew
hu      Hungarian
is      Icelandic
io      Ido
id      Indonesian
ia      Interlingua
ga      Irish
it      Italian
ja      Japanese
kk      Kazakh
ki      Kikuyu
ky      Kirgiz
kg      Kongo
ko      Korean
ku      Kurdish
la      Latin
lv      Latvian
lt      Lithuanian
lu      Luba
mk      Macedonian
mg      Malagasy
ms      Malay
mt      Maltese
mi      Maori
mo      Moldavian
mn      Mongol
nb      NorwegianBokmal
nn      NorwegianNynorsk
ny      Nyanja
ie      Occidental
oj      Ojibway
os      Ossetic
pl      Polish
pt_BR   PortugueseBrazilian
pt_PT   PortugueseStandard
qu      Quechua
rm      RhaetoRomanic
ro      Romanian
rn      Rundi
ru      Russian
sm      Samoan
sr      SerbianCyrillic
sn      Shona
sk      Slovak
sl      Slovenian
so      Somali
st      Sotho
es      Spanish
su      Sunda
sw      Swahili
sv      Swedish
tl      Tagalog
ty      Tahitian
tg      Tajik
tt      Tatar
to      Tongan
tn      Tswana
tr      Turkish
tk      Turkmen
ug      UighurCyrillic, UighurLatin
uk      Ukrainian
uz      UzbekCyrillic, UzbekLatin
cy      Welsh
wo      Wolof
xh      Xhosa
zu      Zulu
no      Norwegian

The OCR Server engine can support additional recognition languages that are not listed in the table. They do not, however, have corresponding codes. For support for other languages, please contact the manufacturer or distributor.

3.2. Available languages for server messages

Table 3-2. Available languages for server messages and their respective codes

Code    Language name
en      English
ru      Russian
de      German
fr      French
es      Spanish
it      Italian
nl      DutchStandard
sv      Swedish
pt      Portuguese

Chapter 4. Protocol

4.1. General rules

OCR Server and the client communicate over TCP/IP. In most cases it is more practical to use this protocol directly in your application than to run the client utility.

With few exceptions, information is exchanged between the client and the server in text form. Commands and server responses are simple, self-descriptive and human-readable. Each message is one line of text terminated by an end-of-line character with code 0x0A ('\n').
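
Because TCP delivers a byte stream, a client usually has to cut received data into messages at the 0x0A terminator itself. The sketch below shows one way to do that on an in-memory buffer. The function name take_message is hypothetical, and how the buffer is filled (for example with recv) is left out on purpose.

```c
#include <stddef.h>
#include <string.h>

/* Copies the first complete message from buf into line, without the
   terminating 0x0A, and returns the number of bytes consumed.
   Returns 0 if buf holds no complete message yet or line is too small. */
size_t take_message(const char *buf, size_t len, char *line, size_t line_size)
{
    const char *newline = memchr(buf, '\n', len);
    size_t text_len;

    if (newline == NULL)
        return 0;
    text_len = (size_t)(newline - buf);
    if (text_len >= line_size)
        return 0;
    memcpy(line, buf, text_len);
    line[text_len] = '\0';
    return text_len + 1; /* message text plus the 0x0A terminator */
}
```

A caller would drop the consumed bytes from the front of its buffer and call the function again until it returns 0. Binary file data (see "Transmitting files") must not be passed through this function.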

At the beginning of the dialogue, and after each command finishes executing, the server sends the message "Send a command now." to indicate that it is ready for the next command. This message is sent regardless of whether the command executed successfully, as in the example below

< Send a command now.
> Set recognition language to fr
< Done.
< Send a command now.

or with an error as in the example below.

< Send a command now.
> Set recognition language to xx
< Error: One or more arguments are invalid
< Send a command now.

Immediately after the client connects, the server sends two messages. The first is a greeting; the second is the invitation described above.

< SILVERCODERS OCR Server
< Send a command now.

Typically, immediately after a command executes successfully, the server sends the result to the client, as in the example below:

< Send a command now.
> Get serial number
< FFFF-1111-2222-3333-4444
< Send a command now.

or the "Done." message, if the executed command does not return a result, as in the example below:

< Send a command now.
> Set image format to jpeg
< Done.
< Send a command now.
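
A client that speaks the protocol directly needs to tell the two fixed messages apart from everything else. A minimal sketch, assuming the message has already been stripped of its 0x0A terminator; the enum and function names are hypothetical and not part of the protocol:

```c
#include <string.h>

typedef enum {
    OCR_MSG_READY, /* "Send a command now." */
    OCR_MSG_DONE,  /* "Done." */
    OCR_MSG_OTHER  /* greeting, result, progress report or error */
} ocr_message_kind;

/* Classifies one server message by comparing it with the fixed texts. */
ocr_message_kind classify_message(const char *line)
{
    if (strcmp(line, "Send a command now.") == 0)
        return OCR_MSG_READY;
    if (strcmp(line, "Done.") == 0)
        return OCR_MSG_DONE;
    return OCR_MSG_OTHER;
}
```

In the dialogues above, a client would send its next command only after seeing OCR_MSG_READY.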

4.2. Get server version

When this command is sent, OCR Server returns its version.

< Send a command now.
> Get server version
< 1.0.3
< Send a command now.

4.3. End session

When this command is sent, OCR Server ends the current session and closes the connection.

< Send a command now.
> End session
< Bye.

4.4. Set protocol version to

When this command is sent, OCR Server switches to the specified protocol version, if it is supported.

Currently only version 1 of the protocol is supported.

< Send a command now.
> Set protocol version to 1
< Done.
< Send a command now.

4.5. Set messages language to

When this command is sent, OCR Server changes the messages language to the specified one, if it is supported.

The messages language is the language used for error messages and recognition tips.

< Send a command now.
> Set messages language to fr
< Done.
< Send a command now.

4.6. Set recognition language to

When this command is sent, OCR Server changes the recognition language to the specified one, if it is supported.

The recognition language is the language used to recognize documents. Setting the recognition language that matches the documents being processed is very important, because OCR Server uses word dictionaries to improve recognition quality.

< Send a command now.
> Set recognition language to fr
< Done.
< Send a command now.

4.7. Upload document

When this command is sent, OCR Server receives the source document file, checks its format and processes it to obtain images of all document pages.

< Send a command now.
> Upload document
< Send file now.
> Extension: tif
> Size: 25276245
> Checksum: 38421
> --- FILE DATA ---
< Page: 0
> Continue
< Page: 1
> Continue
< Page: 2
> Continue
< Page: 3
> Continue
< Done.
< Send a command now.

4.8. Get number of pages

When this command is sent, OCR Server returns the number of pages retrieved from the uploaded document file.

< Send a command now.
> Get number of pages
< 3
< Send a command now.

4.9. Set image width to

When this command is sent, OCR Server changes the width of the image returned by the "Get image of page" command.

< Send a command now.
> Set image width to 300
< Done.
< Send a command now.

4.10. Set image height to

When this command is sent, OCR Server changes the height of the image returned by the "Get image of page" command.

< Send a command now.
> Set image height to 200
< Done.
< Send a command now.

4.11. Set image format to

When this command is sent, OCR Server changes the format of the image returned by the "Get image of page" command.

< Send a command now.
> Set image format to jpeg
< Done.
< Send a command now.

The following image formats are supported:

4.11.1. bmp

The BMP file format, sometimes called bitmap, is an image file format used to store bitmap digital images, especially on Microsoft Windows and OS/2 operating systems.

4.11.2. jpeg

The JPEG File Interchange Format (JFIF) is an image file format for exchanging JPEG encoded files compliant with the JPEG Interchange Format (JIF) standard.

4.11.3. png

Portable Network Graphics (PNG) is an image format that employs lossless data compression. PNG was created to improve upon and replace the GIF format, as an image-file format not requiring a patent license.

4.11.4. tiff_jpeg

Tagged Image File Format (abbreviated TIFF) is a file format for storing images, including photographs and line art. This format uses JPEG compression.

4.11.5. tiff_zip

Tagged Image File Format (abbreviated TIFF) is a file format for storing images, including photographs and line art. This format uses ZIP compression.

4.12. Enable page breaks in plain text

When this command is sent, OCR Server enables page breaks in the plain text returned by the "Export document as plain text" command. The form feed character (code 0x0C) is used to separate pages.

< Send a command now.
> Enable page breaks in plain text
< Done.
< Send a command now.

This command requires server version 1.0.2 or later.
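
On the client side, the form feed separators make it possible to find page boundaries in the exported text. The sketch below counts pages under the assumption that one form feed sits between each pair of consecutive pages and that none follows the last page; the function name is hypothetical.

```c
#include <stddef.h>

/* Counts pages in exported plain text, assuming one form feed (0x0C)
   between consecutive pages and none after the last page. */
size_t count_text_pages(const char *text, size_t length)
{
    size_t pages = 1;
    size_t i;

    for (i = 0; i < length; i++) {
        if (text[i] == '\f') /* '\f' is the form feed character, 0x0C */
            pages++;
    }
    return pages;
}
```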

4.13. Get image of page

When this command is sent, OCR Server sends an image of the specified page.

< Send a command now.
> Get image of page 0
< Done.
< Extension: jpg
< Size: 144628
< Checksum: 33381
< --- FILE DATA ---
< Send a command now.

By default, the width and height of the returned image are taken from the source document, but you can change them using the "Set image width to" and "Set image height to" commands.

The default format of the returned image is PNG, but you can change it using the "Set image format to" command.

4.14. Recognize page

When this command is sent, OCR Server performs recognition of the specified page.

< Send a command now.
> Recognize page 0
< Progress: 0
> Continue
< Progress: 0
> Continue
< Progress: 0
...
< Progress: 1
> Continue
< Progress: 19
> Continue
< Rect: 2376 134 2526 222
> Continue
< Tip: Increase resolution to improve recognition accuracy of small text.
> Continue
< Progress: 19
> Continue
< Rect: 332 372 480 418
...
< Progress: 100
> Continue
< Done.
< Send a command now.

4.15. Export document as

When this command is sent, OCR Server sends the recognition results in the specified format.

< Send a command now.
> Export document as plain text
< Progress: 0
> Continue
< Progress: 0
> Continue
< Progress: 25
> Continue
< Progress: 50
> Continue
< Progress: 75
> Continue
< Progress: 100
> Continue
< Done.
< Extension: txt
< Size: 8533
< Checksum: 46941
< --- FILE DATA ---
< Send a command now.

The following export formats are supported:

4.15.1. plain text

Plain text format is an ideal solution for indexing and searching purposes. It can be edited using text editors.

4.15.2. rtf

Rich text format is an ideal solution for further editing. It preserves document layout and embedded images. It can be edited using most modern word processors.

4.15.3. pdf

Portable document format is an ideal solution for printing. Editing is currently limited to some specialized, commercial software.

4.15.4. html

HTML format is an ideal solution for publishing recognized text on web pages. It can be edited using text editors or web authoring tools.

4.16. Getting details of current license

Every license limits the number of measure units (pages or characters) that can be processed during a period of time (hour, day, week, month or year). In addition, some licenses have a non-renewable limit.

Licenses can be temporary (expiring on a specific date or after a specific number of days) or permanent.

4.16.1. Get serial number

When this command is sent, OCR Server returns the serial number of the current license.

< Send a command now.
> Get serial number
< FFFF-1111-2222-3333-4444
< Send a command now.

4.16.2. Get counter measure unit

When this command is sent, OCR Server returns the measure unit of the current license's limit. It can be "pages" or "characters".

< Send a command now.
> Get counter measure unit
< pages
< Send a command now.

4.16.3. Get limitation period

When this command is sent, OCR Server returns the time period of the current license's limit. It can be "hour", "day", "week", "month", "year" or "infinite". If it is "infinite", the license has a non-renewable limit.

< Send a command now.
> Get limitation period
< month
< Send a command now.

4.16.4. Get units per period

When this command is sent, OCR Server returns the maximum number of units that can be used in one period under the current license.

< Send a command now.
> Get units per period
< 10000
< Send a command now.

4.16.5. Get remaining units

When this command is sent, OCR Server returns the number of units that can still be used in the current period under the current license.

< Send a command now.
> Get remaining units
< 9997
< Send a command now.

4.16.6. Get expiration date

When this command is sent, OCR Server returns the expiration date of the current license. The date is returned in ISO format (YYYY-MM-DD), or "none" is returned if the license is permanent.

< Send a command now.
> Get expiration date
< none
< Send a command now.
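
The reply to "Get expiration date" is either a date or the word "none", so a client has to handle both cases. A minimal sketch with a hypothetical function name; it only performs a basic range check and does not validate the day against the month:

```c
#include <stdio.h>
#include <string.h>

/* Parses the reply to "Get expiration date" (hypothetical helper).
   Returns 1 and fills year/month/day for a YYYY-MM-DD date,
   0 for "none" (permanent license), and -1 for anything else. */
int parse_expiration(const char *reply, int *year, int *month, int *day)
{
    char extra;

    if (strcmp(reply, "none") == 0)
        return 0;
    /* The trailing %c detects unexpected characters after the date. */
    if (sscanf(reply, "%4d-%2d-%2d%c", year, month, day, &extra) != 3)
        return -1;
    if (*month < 1 || *month > 12 || *day < 1 || *day > 31)
        return -1;
    return 1;
}
```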

4.17. Transmitting files

Some commands, such as "Upload document" or "Export document as", require a file to be transmitted to or from OCR Server. The protocol provides a way to transmit a file and verify its integrity.

4.17.1. File header

Each transmitted file begins with a header. The header consists of three lines. Each line begins with a prefix, which is separated from the value by a colon and a space.

You can find a sample file header below:

Extension: jpg
Size: 10244
Checksum: 14514
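
When uploading a document, the client has to produce such a header itself. The sketch below formats the three lines with snprintf. The function name is hypothetical, and the sketch assumes that each header line is terminated by 0x0A like every other message in the protocol.

```c
#include <stdio.h>

/* Writes the three-line file header into out (hypothetical helper).
   Assumes each header line ends with 0x0A, like other protocol messages.
   Returns the header length, or -1 if out is too small. */
int format_file_header(char *out, size_t out_size,
                       const char *extension, long size, int checksum)
{
    int written = snprintf(out, out_size,
                           "Extension: %s\nSize: %ld\nChecksum: %d\n",
                           extension, size, checksum);

    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return written;
}
```

Calling it with "jpg", 10244 and 14514 reproduces the sample header above; the raw file bytes would then be sent immediately afterwards.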

4.17.2. File data

File data follows the file header directly. It is a plain stream of bytes and is not encoded.

4.17.3. Calculating checksum

The file checksum is calculated using an extended XOR algorithm. It is always 16 bits long. The first (high) 8 bits are all file bytes XORed together. The second (low) 8 bits are all negated file bytes XORed together. Negation is required to detect the loss of bytes with value 0.

A sample checksum function is shown below (C language):

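/* Returns the 16-bit checksum of data: the high byte is the XOR of all
   bytes, the low byte is the XOR of all bitwise-negated bytes. */
int calculate_checksum(const char* data, int size)
{
	unsigned char checksum_1 = 0; /* XOR of all file bytes */
	unsigned char checksum_2 = 0; /* XOR of all negated file bytes */
	for (int i = 0; i < size; i++)
	{
		unsigned char byte = (unsigned char)data[i];
		checksum_1 ^= byte;
		checksum_2 ^= (unsigned char)~byte;
	}
	return (checksum_1 << 8) | checksum_2;
}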
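
Large files are often read and sent in chunks, so it can be convenient to compute the same checksum incrementally instead of over one buffer. The following is a sketch of such a variant, not part of the original sample; the type checksum_state and both function names are hypothetical.

```c
#include <stddef.h>

typedef struct {
    unsigned char plain;   /* XOR of all bytes seen so far */
    unsigned char negated; /* XOR of all negated bytes seen so far */
} checksum_state;

/* Folds one more chunk of file data into the running checksum. */
void checksum_update(checksum_state *state, const unsigned char *data, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++) {
        state->plain ^= data[i];
        state->negated ^= (unsigned char)~data[i];
    }
}

/* Combines both halves into the 16-bit value sent in the file header. */
int checksum_value(const checksum_state *state)
{
    return (state->plain << 8) | state->negated;
}
```

Starting from a zeroed state, the single byte 0x41 gives 0x41BE, and adding the byte 0x42 gives 0x0303. With this definition the low byte equals the high byte for an even number of bytes and its bitwise complement for an odd number, which is how a lost byte, including a zero byte, changes the result.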

4.18. Reporting and controlling progress of operation

Some commands, such as "Recognize page" or "Export document as", start a process that can be quite time-consuming. So that user-friendly client applications can be implemented, the protocol provides a way to report the progress of such operations and to cancel them on demand.

4.18.1. Reporting progress in percentage

OCR Server reports the progress of time-consuming operations as a percentage. The percentage value is an integer in the range 0..100. Each value can be reported more than once.

< Progress: 0
> Continue
< Progress: 0
> Continue
< Progress: 25
> Continue
< Progress: 50
> Continue

This feature is often used to implement a progress bar in the client application.
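
To drive a progress bar, the client first has to extract the percentage from the message. A minimal sketch, assuming the message has already been stripped of its 0x0A terminator; the function name is hypothetical:

```c
#include <stdio.h>

/* Extracts the percentage from a "Progress: N" message.
   Returns the value (0..100), or -1 if the line is not a valid
   progress report. */
int parse_progress(const char *line)
{
    int percent;
    char extra;

    /* The trailing %c detects unexpected characters after the number. */
    if (sscanf(line, "Progress: %d%c", &percent, &extra) != 1)
        return -1;
    if (percent < 0 || percent > 100)
        return -1;
    return percent;
}
```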

4.18.2. Reporting current page

During time-consuming multipage operations, OCR Server reports the number of the current page. The page number is an integer greater than or equal to 0. Each value is reported only once.

< Page: 0
> Continue
< Page: 1
> Continue
< Page: 2
> Continue

This feature is often used to show operation progress in the client application.

4.18.3. Reporting current image region

During time-consuming multiregion operations, OCR Server reports the coordinates of the current region. The coordinates consist of four integer values (left, top, right, bottom) separated by spaces. The unit is the pixel.

< Progress: 19
> Continue
< Rect: 2376 134 2526 222
> Continue

This feature is often used to mark the region being processed on a document preview in the client application.
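
To highlight the region on a preview, the four coordinates have to be read from the message. A minimal sketch with a hypothetical struct and function name, again assuming the 0x0A terminator has been removed:

```c
#include <stdio.h>

typedef struct {
    int left, top, right, bottom; /* pixel coordinates, in message order */
} ocr_rect;

/* Parses a "Rect: left top right bottom" message.
   Returns 0 on success and -1 if the line has a different form. */
int parse_rect(const char *line, ocr_rect *rect)
{
    char extra;
    int fields = sscanf(line, "Rect: %d %d %d %d%c",
                        &rect->left, &rect->top,
                        &rect->right, &rect->bottom, &extra);

    return fields == 4 ? 0 : -1;
}
```

For the sample message above, right minus left gives a region 150 pixels wide and bottom minus top gives a height of 88 pixels.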

4.18.4. Recognition tips

OCR Server reports important tips that can improve recognition quality. Messages of this kind should be presented to the end user in the client application.

< Progress: 19
> Continue
< Tip: Increase resolution to improve recognition accuracy of small text.
> Continue
< Progress: 19
> Continue

4.18.5. Continue or cancel the process

After sending progress information or a recognition tip, OCR Server waits for the client application to send the "Continue" command. This command confirms that the user interface was updated, the end user did not cancel the operation, and the process can continue.

Instead of the confirmation, the "Cancel" command can be sent to OCR Server. This command asks the server to abort the current operation.

< Progress: 0
> Continue
< Progress: 1
> Cancel
< Error: The operation was canceled.
< Send a command now.

This feature is often used to implement a "Cancel" button in the client application.
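
The decision a client makes after each report can be sketched as a small function. The name reply_for and the cancel_requested flag are hypothetical. The list of prefixes is taken from the examples in this chapter, where "Progress:", "Page:", "Rect:" and "Tip:" messages are each followed by a client reply.

```c
#include <string.h>

/* Returns the reply to send for a server message, including the 0x0A
   terminator, or NULL if the message does not expect a reply. */
const char *reply_for(const char *line, int cancel_requested)
{
    static const char *const prefixes[] = {
        "Progress: ", "Page: ", "Rect: ", "Tip: "
    };
    size_t i;

    for (i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++) {
        if (strncmp(line, prefixes[i], strlen(prefixes[i])) == 0)
            return cancel_requested ? "Cancel\n" : "Continue\n";
    }
    return NULL;
}
```

A "Cancel" button handler would only set the flag; the reply itself is sent the next time the server reports progress.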

4.18.6. Reporting that the process is complete

When the process finishes successfully, OCR Server sends the "Done." message.

< Progress: 99
> Continue
< Progress: 100
> Continue
< Done.
< Send a command now.

4.19. Reporting errors

OCR Server can report two types of errors. Standard errors relate only to the operation currently being performed; after sending the relevant information, the server is ready to execute other commands. After a fatal error, the server terminates the connection and exits.

OCR Server indicates standard errors by sending a message that begins with "Error:". Fatal errors are signaled by a message that begins with "Fatal error:".

< Send a command now.
> Set recognition language to xx
< Error: One or more arguments are invalid
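
A client can distinguish the two error types by their prefixes and decide whether the connection is still usable. A minimal sketch; the enum and function names are hypothetical, and the message is assumed to be stripped of its 0x0A terminator.

```c
#include <string.h>

typedef enum {
    OCR_NOT_AN_ERROR,
    OCR_STANDARD_ERROR, /* the server stays ready for other commands */
    OCR_FATAL_ERROR     /* the server terminates the connection */
} ocr_error_kind;

/* Classifies a server message and points *text at the error description
   (or at the whole line if it is not an error message). */
ocr_error_kind classify_error(const char *line, const char **text)
{
    static const char standard[] = "Error:";
    static const char fatal[] = "Fatal error:";
    ocr_error_kind kind = OCR_NOT_AN_ERROR;
    const char *rest = line;

    if (strncmp(line, standard, sizeof standard - 1) == 0) {
        kind = OCR_STANDARD_ERROR;
        rest = line + sizeof standard - 1;
    } else if (strncmp(line, fatal, sizeof fatal - 1) == 0) {
        kind = OCR_FATAL_ERROR;
        rest = line + sizeof fatal - 1;
    }
    /* Skip the space that separates the prefix from the description. */
    while (kind != OCR_NOT_AN_ERROR && *rest == ' ')
        rest++;
    *text = rest;
    return kind;
}
```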