OCR Server documentation
Chapter 1. Installation
1.1. Preliminary requirements
OCR Server is generally compatible with Linux systems on the x86 architecture. It should run without problems on all modern GNU/Linux distributions.
Check that the tcpserver and tcpclient utilities are installed in one of the locations listed in the PATH environment variable. These utilities are part of the ucspi-tcp package.
Check that the curl utility is installed in one of the locations listed in the PATH environment variable. Curl is needed for the activation process.
Verify the requirements above before reporting problems with the application.
1.2. Program installation
To install OCR Server, simply decompress the "ocr_server-(version number).tar.bz2" archive to your destination location.
No further steps are required.
1.3. Activation
OCR Server will not function without activation.
You must be logged in as root to activate the application. Run the activate.sh script and enter your serial number when asked. Activation is carried out automatically over the Internet, so an Internet connection is required.
Chapter 2. Running
2.1. Running the server
To run the server, execute the ocr_server.sh script. By default, OCR Server listens on port 10000. To use a different port, pass it as a command line argument, as shown below:
./ocr_server.sh 10001
2.2. Running the client
To run the client, execute the ocr_client.sh script. Two command line arguments are required. The first argument is the IP address of the host running OCR Server, optionally followed by a colon and a port number. If the port number is omitted, the client connects to the default port 10000. The second argument is the path to the input file. See the examples below:
./ocr_client.sh 127.0.0.1 ~/images/test.jpg
./ocr_client.sh 192.168.0.2:10001 fax_for_ocr.pdf
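An application that accepts the same HOST[:PORT] form as the client could split it as in the following minimal sketch. The function name parse_target and the DEFAULT_PORT macro are hypothetical and are not part of OCR Server; only the default port 10000 comes from this documentation.

```c
#include <stdlib.h>
#include <string.h>

#define DEFAULT_PORT 10000 /* default OCR Server port from this documentation */

/* Splits "HOST[:PORT]" into a host string and a port number.
   Returns 0 on success and -1 if the target is malformed. */
int parse_target(const char *target, char *host, size_t host_size, int *port)
{
    const char *colon = strchr(target, ':');
    size_t host_len = colon ? (size_t)(colon - target) : strlen(target);
    char *end;
    long value;

    if (host_len == 0 || host_len >= host_size)
        return -1;
    memcpy(host, target, host_len);
    host[host_len] = '\0';

    if (colon == NULL) {
        /* No port given: fall back to the default. */
        *port = DEFAULT_PORT;
        return 0;
    }

    value = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || *end != '\0' || value < 1 || value > 65535)
        return -1;
    *port = (int)value;
    return 0;
}
```

With this sketch, "127.0.0.1" yields port 10000 and "192.168.0.2:10001" yields port 10001, matching the two client examples above.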
If you run ocr_client.sh with incorrect command arguments, the following help message will be displayed:
Usage: ocr_client.sh HOST[:PORT] [OPTION ...] FILE
HOST OCR Server host, eg. 127.0.0.1
PORT OCR Server port, default is 10000
-e FORMAT export format: text | rtf | pdf | html | bmp | jpeg | png | tiff_jpeg | tiff_zip
-m LANG_CODE messages language, eg. de or fr
-r LANG_CODE recognition language, eg. de or fr
-w WIDTH image width (for image format export)
-h HEIGHT image height (for image format export)
Chapter 3. Protocol
3.1. General description
OCR Server and the client communicate over TCP/IP. In most cases it is more practical to use this protocol directly in your application than to run the client utility.
Server commands and responses are simple, self-descriptive and human readable.
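Using the protocol directly starts with opening a TCP connection to the server. The following is a minimal sketch using POSIX sockets, assuming an IPv4 address in dotted notation. The function name connect_to_ocr_server is hypothetical, and the sketch deliberately sends no commands.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Opens a TCP connection to an OCR Server at the given IPv4 address and port.
   Returns a connected socket descriptor, or -1 on any error. */
int connect_to_ocr_server(const char *ip, int port)
{
    struct sockaddr_in addr;
    int fd;

    if (port < 1 || port > 65535)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1; /* not a valid dotted IPv4 address */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) != 0) {
        close(fd);
        return -1;
    }
    return fd; /* caller reads and writes on fd, then calls close(fd) */
}
```

A caller would typically pass 10000 as the port unless the server was started with a different one.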
3.2. Get server version
When this command is sent, OCR Server returns its version.
3.3. End session
When this command is sent, OCR Server ends the current session and closes the connection.
3.4. Set protocol version to
When this command is sent, OCR Server switches the current protocol to the specified version, if it is supported.
Currently only version 1 of the protocol is supported.
3.5. Set messages language to
When this command is sent, OCR Server changes the current messages language to the specified one, if it is supported.
The messages language is the language used for error messages and recognition tips.
3.6. Set recognition language to
When this command is sent, OCR Server changes the current recognition language to the specified one, if it is supported.
The recognition language is the language used to recognize documents. Setting the recognition language that matches the documents being processed is very important, because OCR Server uses word dictionaries to improve recognition quality.
3.7. Upload document
When this command is sent, OCR Server receives the source document file, checks its format and processes it to obtain images of all document pages.
3.8. Get number of pages
When this command is sent, OCR Server returns the number of pages retrieved from the uploaded document file.
3.9. Set image width to
When this command is sent, OCR Server changes the width of the image returned by the "Get image of page" command.
3.10. Set image height to
When this command is sent, OCR Server changes the height of the image returned by the "Get image of page" command.
3.11. Set image format to
When this command is sent, OCR Server changes the format of the image returned by the "Get image of page" command.
The following image formats are supported:
3.11.1. bmp
The BMP file format, sometimes called bitmap, is an image file format used to store bitmap digital images, especially on Microsoft Windows and OS/2 operating systems.
3.11.2. jpeg
The JPEG File Interchange Format (JFIF) is an image file format for exchanging JPEG encoded files compliant with the JPEG Interchange Format (JIF) standard.
3.11.3. png
Portable Network Graphics (PNG) is an image format that employs lossless data compression. PNG was created to improve upon and replace the GIF format, as an image-file format not requiring a patent license.
3.11.4. tiff_jpeg
Tagged Image File Format (abbreviated TIFF) is a file format for storing images, including photographs and line art. This format uses JPEG compression.
3.11.5. tiff_zip
Tagged Image File Format (abbreviated TIFF) is a file format for storing images, including photographs and line art. This format uses ZIP compression.
3.12. Get image of page
When this command is sent, OCR Server sends the image of the specified page.
The default width and height of the returned image are taken from the source document, but you can change them using the "Set image width to" and "Set image height to" commands.
The default format of the returned image is PNG, but you can change it using the "Set image format to" command.
3.13. Recognize page
When this command is sent, OCR Server performs recognition of the specified page.
3.14. Export document as
When this command is sent, OCR Server sends the recognition results in the specified format.
The following export formats are supported:
3.14.1. plain text
Plain text format is an ideal solution for indexing and searching purposes. It can be edited using text editors.
3.14.2. rtf
Rich text format is an ideal solution for future editing. It preserves document layout and embedded images. It can be edited using most modern word processors.
3.14.3. pdf
Portable document format is an ideal solution for printing. Editing is currently limited to some specialized commercial software.
3.14.4. html
HTML format is an ideal solution for publishing recognized text on web pages. It can be edited using text editors or webmastering tools.
3.15. Getting details of current license
Every license limits the number of measure units (pages or characters) that can be processed during a period of time (hour, day, week, month or year). In addition, some licenses have a non-renewable limit.
Licenses can be temporary (expiring on a specific date or after a specific number of days) or permanent.
3.15.1. Get serial number
When this command is sent, OCR Server returns the serial number of the current license.
3.15.2. Get counter measure unit
When this command is sent, OCR Server returns the measure unit of the current license's limitation. It can be "pages" or "characters".
3.15.3. Get limitation period
When this command is sent, OCR Server returns the time period of the current license's limitation. It can be "hour", "day", "week", "month", "year" or "infinite". If it is "infinite", the license has a non-renewable limit.
3.15.4. Get units per period
When this command is sent, OCR Server returns the maximum number of units that can be used in one period of time under the current license's limitation.
3.15.5. Get remaining units
When this command is sent, OCR Server returns the number of units that can still be used in the current period of time under the current license's limitation.
3.15.6. Get expiration date
When this command is sent, OCR Server returns the expiration date of the current license. The returned date is in ISO format (YYYY-MM-DD), or "none" if the license is permanent.
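A client that reads this response has to handle both the date and the "none" value. The following is a minimal sketch, assuming the response text has already been read from the connection with any line terminator removed. The struct and function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Parsed form of the expiration response (hypothetical helper type). */
struct expiration {
    int permanent; /* 1 if the server answered "none" */
    int year;
    int month;
    int day;
};

/* Parses "YYYY-MM-DD" or "none". Returns 0 on success, -1 on bad input. */
int parse_expiration(const char *text, struct expiration *out)
{
    char trailing;

    memset(out, 0, sizeof *out);
    if (strcmp(text, "none") == 0) {
        out->permanent = 1;
        return 0;
    }
    /* Require exactly ten characters so that "2025-1-5" is rejected. */
    if (strlen(text) != 10)
        return -1;
    /* The extra %c catches trailing garbage: exactly 3 fields must match. */
    if (sscanf(text, "%4d-%2d-%2d%c",
               &out->year, &out->month, &out->day, &trailing) != 3)
        return -1;
    if (out->month < 1 || out->month > 12 || out->day < 1 || out->day > 31)
        return -1;
    return 0;
}
```

The sketch checks only the basic range of month and day. It does not validate the number of days in a particular month.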
3.16. Transmitting files
Some commands, such as "Upload document" or "Export document as", need a file to be transmitted to or from OCR Server. The protocol provides a way to transmit a file and verify its integrity.
3.16.1. File header
Each transmitted file begins with a header consisting of three lines. Each line begins with a prefix, which is separated from its value by a colon and a space.
You can find a sample file header below:
Extension: jpg
Size: 10244
Checksum: 14514
3.16.2. File data
The file data follows the file header directly. It is a plain stream of bytes with no encoding applied.
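The header layout can be produced and read back as in the following minimal sketch. It assumes that each header line ends with a single newline character, which this documentation does not specify. The struct and function names are hypothetical, and the 15-character limit on the extension is an arbitrary choice for the example.

```c
#include <stdio.h>

/* Parsed file header (hypothetical helper type). */
struct file_header {
    char extension[16];
    long size;     /* number of data bytes that follow the header */
    int checksum;  /* 16-bit checksum of the file data */
};

/* Writes the three header lines into buf.
   Returns the header length in bytes, or -1 if buf is too small. */
int format_file_header(char *buf, size_t buf_size,
                       const char *extension, long size, int checksum)
{
    int written = snprintf(buf, buf_size,
                           "Extension: %s\nSize: %ld\nChecksum: %d\n",
                           extension, size, checksum);
    if (written < 0 || (size_t)written >= buf_size)
        return -1;
    return written;
}

/* Reads the three header lines from text. Returns 0 on success, -1 on error. */
int parse_file_header(const char *text, struct file_header *header)
{
    /* A space in the format matches any run of whitespace, including
       the line breaks between the three header lines. */
    int fields = sscanf(text, "Extension: %15s Size: %ld Checksum: %d",
                        header->extension, &header->size, &header->checksum);
    if (fields != 3 || header->size < 0 ||
        header->checksum < 0 || header->checksum > 65535)
        return -1;
    return 0;
}
```

After sending or receiving the header, exactly Size bytes of raw file data would follow on the same connection.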
3.16.3. Calculating checksum
The file checksum is calculated using an extended XOR algorithm. It is always 16 bits long. The first (high) 8 bits are all file bytes XORed together. The second (low) 8 bits are all negated file bytes XORed together. Negation is required to detect the loss of bytes with value 0.
A sample checksum function in C is shown below:
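int calculate_checksum(const char* data, int size)
{
    unsigned char checksum_1 = 0; /* XOR of all file bytes */
    unsigned char checksum_2 = 0; /* XOR of all negated file bytes */
    for (int i = 0; i < size; i++)
    {
        checksum_1 ^= (unsigned char)data[i];
        checksum_2 ^= (unsigned char)~data[i];
    }
    /* High byte: plain XOR. Low byte: XOR of the negated bytes. */
    return (checksum_1 << 8) | checksum_2;
}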
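The following sketch shows how a receiver might compare the computed checksum with the value from the file header, and why the negated half matters. The checksum function is repeated so that the example compiles on its own. The helper checksum_matches is hypothetical.

```c
/* Same algorithm as the sample function above. */
int calculate_checksum(const char *data, int size)
{
    unsigned char checksum_1 = 0; /* XOR of all file bytes */
    unsigned char checksum_2 = 0; /* XOR of all negated file bytes */
    for (int i = 0; i < size; i++)
    {
        checksum_1 ^= (unsigned char)data[i];
        checksum_2 ^= (unsigned char)~data[i];
    }
    return (checksum_1 << 8) | checksum_2;
}

/* Returns 1 if the received data matches the checksum from the header. */
int checksum_matches(const char *data, int size, int expected)
{
    return calculate_checksum(data, size) == expected;
}
```

For the bytes 0x12 0x00 0x34 the checksum is 0x26D9. If the zero byte is lost and only 0x12 0x34 arrive, the plain XOR half is still 0x26, but the negated half changes from 0xD9 to 0x26, so the checksum becomes 0x2626 and the mismatch is detected.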