Supported Languages for .NET OCR SDK

Popular .NET SDK

.NET XsOCR SDK / C# extracting text for multiple languages and fonts

With the OCR SDK for .NET, developers can recognize text in multiple languages from C#, including English, Latin, French, Chinese, Japanese (kanji), Spanish, German, Hindi, Greek, Italian, Russian, Turkish, Dutch, Portuguese, Thai, and Korean.

This online guide shows how to recognize two languages, English and German, in a single pass using C# code. Make sure the "eng.traineddata" and "deu.traineddata" language files are in the "tessdata" folder.

// Please note:
// If you target the x64 platform, copy "XsOCR_Tesseract.dll" and "XsOCR_Lept.dll"
// from the x64 folder into the same folder as "XsOCR.dll".
// Otherwise, copy them from the x86 folder.
// For example, if "XsOCR.dll" is in "/bin/", then "XsOCR_Tesseract.dll" and
// "XsOCR_Lept.dll" must also be copied to "/bin/".

// Create an OCR Engine instance
OCREngine engine = new OCREngine();
// Set the absolute path of tessdata
engine.DataPath = "F:/tessdata/";
// Set the languages to recognize, joined with "+"
engine.Language = "eng+deu";
// Recognize text from image file
string text = engine.DoOCR("F:/sample.jpg");

System.Console.WriteLine(text);
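
The note about the x64 and x86 folders can also be handled in code rather than by hand. The following is a minimal sketch, not part of the XsOCR SDK: the class name NativeDllSetup, the sdkRoot parameter, and the assumption that the SDK folder contains "x64" and "x86" subfolders are illustrative only. It also picks the folder from the bitness of the running process, which may differ from the platform chosen in your build settings.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the XsOCR SDK.
static class NativeDllSetup
{
    // File names taken from the note above.
    static readonly string[] NativeDlls = { "XsOCR_Tesseract.dll", "XsOCR_Lept.dll" };

    // Returns the subfolder name that matches the given process bitness.
    public static string PlatformFolder(bool is64BitProcess)
    {
        return is64BitProcess ? "x64" : "x86";
    }

    // Copies the native DLLs from <sdkRoot>/<x64|x86>/ into targetDir,
    // which should be the folder that contains XsOCR.dll.
    // Assumption: the SDK ships the DLLs in "x64" and "x86" subfolders of sdkRoot.
    public static void CopyNativeDlls(string sdkRoot, string targetDir)
    {
        string sourceDir = Path.Combine(sdkRoot, PlatformFolder(Environment.Is64BitProcess));
        Directory.CreateDirectory(targetDir);
        foreach (string dll in NativeDlls)
        {
            File.Copy(Path.Combine(sourceDir, dll), Path.Combine(targetDir, dll), true);
        }
    }
}
```

A call such as NativeDllSetup.CopyNativeDlls(sdkRoot, binFolder) would run once at startup, before the first OCREngine instance is created.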

Find and download all OCR language data from this page.
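
Since recognition depends on the language files being present in the "tessdata" folder, it can help to check for them before calling the engine. The sketch below is a hypothetical helper (the LanguageData class and its method names are not part of the XsOCR SDK). It assumes only what the example above shows: language codes are joined with "+", and each code maps to a file named "<code>.traineddata".

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the XsOCR SDK.
static class LanguageData
{
    // Maps a language string such as "eng+deu" to the expected file names,
    // e.g. "eng.traineddata" and "deu.traineddata".
    public static List<string> RequiredFiles(string languages)
    {
        var files = new List<string>();
        foreach (string code in languages.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries))
        {
            files.Add(code.Trim() + ".traineddata");
        }
        return files;
    }

    // Returns the required files that are not present in dataPath.
    public static List<string> MissingFiles(string dataPath, string languages)
    {
        var missing = new List<string>();
        foreach (string file in RequiredFiles(languages))
        {
            if (!File.Exists(Path.Combine(dataPath, file)))
            {
                missing.Add(file);
            }
        }
        return missing;
    }
}
```

For example, LanguageData.MissingFiles("F:/tessdata/", "eng+deu") returns an empty list when both files are in place; otherwise the returned names show which language data still needs to be downloaded.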

Notice: if you use the trial version of the OCR SDK, the first character of the result is the symbol "?".

-- Professional OCR Tool --

C# code tutorial

OCR multi-language support in C#: a C# .NET tutorial with details on OCR multi-language support and source preparation.
