public class TessBaseAPI extends Object
Modifier and Type | Class and Description |
---|---|
static interface | TessBaseAPI.OcrEngineMode |
static class | TessBaseAPI.PageIteratorLevel: Elements of the page hierarchy, used in ResultIterator to provide functions that operate on each level without having to have 5x as many functions. |
static class | TessBaseAPI.PageSegMode: Page segmentation mode. |
static interface | TessBaseAPI.ProgressNotifier: Interface that may be implemented by calling object in order to receive progress callbacks during OCR. |
class | TessBaseAPI.ProgressValues: Represents values indicating recognition progress and status. |
Modifier and Type | Field and Description |
---|---|
static int | OEM_CUBE_ONLY: Deprecated. |
static int | OEM_DEFAULT: Default OCR engine mode. |
static int | OEM_TESSERACT_CUBE_COMBINED: Deprecated. |
static int | OEM_TESSERACT_ONLY: Run Tesseract only - fastest |
static String | VAR_CHAR_BLACKLIST: Blacklist of characters to not recognize. |
static String | VAR_CHAR_WHITELIST: Whitelist of characters to recognize. |
static String | VAR_FALSE: String value used to assign a boolean variable to false. |
static String | VAR_SAVE_BLOB_CHOICES: Save blob choices allowing us to get alternative results. |
static String | VAR_TRUE: String value used to assign a boolean variable to true. |
Constructor and Description |
---|
TessBaseAPI(): Constructs an instance of TessBaseAPI. |
TessBaseAPI(TessBaseAPI.ProgressNotifier progressNotifier): Constructs an instance of TessBaseAPI with a callback method for receiving progress updates during OCR. |
Modifier and Type | Method and Description |
---|---|
boolean | addPageToDocument(Pix imageToProcess, String imageToWrite, TessPdfRenderer tessPdfRenderer): Adds the given data to the opened document (if any). |
boolean | beginDocument(TessPdfRenderer tessPdfRenderer): Starts a new document with no title. |
boolean | beginDocument(TessPdfRenderer tessPdfRenderer, String title): Starts a new document. |
void | clear(): Frees up recognition results and any stored image data, without actually freeing any recognition data that would be time-consuming to reload. |
void | end(): Closes down Tesseract and frees up all memory. |
boolean | endDocument(TessPdfRenderer tessPdfRenderer): Finishes the document and finalizes the output data. |
String | getBoxText(int page): The recognized text is returned as coded in the same format as a UTF8 box file used in training. |
Pixa | getConnectedComponents(): Gets the individual connected (text) components (created after the page segmentation step, but before recognition) as a Pixa, in reading order. |
String | getHOCRText(int page): Make an HTML-formatted string with hOCR markup from the internal data structures. |
String | getInitLanguagesAsString(): Returns the languages string used in the last valid initialization. |
int | getPageSegMode(): Return the current page segmentation mode. |
Pixa | getRegions(): Returns the result of page layout analysis as a Pixa, in reading order. |
ResultIterator | getResultIterator(): Get a reading-order iterator to the results of LayoutAnalysis and/or Recognize. |
Pixa | getStrips(): Get textlines and strips of image regions as a Pixa, in reading order. |
Pixa | getTextlines(): Returns the textlines as a Pixa. |
Pix | getThresholdedImage(): Get a copy of the internal thresholded image from Tesseract. |
String | getUTF8Text(): The recognized text is returned as a String which is coded as UTF8. |
String | getVersion(): Returns the version identifier as a string. |
Pixa | getWords(): Get the words as a Pixa, in reading order. |
boolean | init(String datapath, String language): Initializes the Tesseract engine with a specified language model. |
boolean | init(String datapath, String language, int ocrEngineMode): Initializes the Tesseract engine with the specified language model(s). |
int | meanConfidence(): Returns the (average) confidence value between 0 and 100. |
protected void | onProgressValues(int percent, int left, int right, int top, int bottom, int textLeft, int textRight, int textTop, int textBottom): Called from native code to update progress of ongoing recognition passes. |
void | readConfigFile(String filename): Read a "config" file containing a set of variable, value pairs. |
void | setDebug(boolean enabled): Sets debug mode. |
void | setImage(Bitmap bmp): Provides an image for Tesseract to recognize. |
void | setImage(byte[] imagedata, int width, int height, int bpp, int bpl): Provides an image for Tesseract to recognize. |
void | setImage(File file): Provides an image for Tesseract to recognize. |
void | setImage(Pix image): Provides a Leptonica pix format image for Tesseract to recognize. |
void | setInputName(String name): Set the name of the input file. |
void | setOutputName(String name): Set the name of the bonus output files. |
void | setPageSegMode(int mode): Sets the page segmentation mode. |
void | setRectangle(int left, int top, int width, int height): Restricts recognition to a sub-rectangle of the image. |
void | setRectangle(Rect rect): Restricts recognition to a sub-rectangle of the image. |
boolean | setVariable(String var, String value): Set the value of an internal "parameter." |
void | stop(): Cancel recognition started by getHOCRText(int). |
int[] | wordConfidences(): Returns all word confidences (between 0 and 100) in an array. |
public static final String VAR_CHAR_WHITELIST
public static final String VAR_CHAR_BLACKLIST
public static final String VAR_SAVE_BLOB_CHOICES
public static final String VAR_TRUE
public static final String VAR_FALSE
public static final int OEM_TESSERACT_ONLY
@Deprecated public static final int OEM_CUBE_ONLY
@Deprecated public static final int OEM_TESSERACT_CUBE_COMBINED
public static final int OEM_DEFAULT
public TessBaseAPI()
When the instance of TessBaseAPI is no longer needed, its end()
method must be invoked to dispose of it.
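The disposal requirement above maps naturally onto a try/finally block. The following is a minimal sketch, not code from the library: `OcrEngine` is a hypothetical stand-in interface that mirrors the documented `init`, `getUTF8Text`, and `end` signatures so the example compiles without the Android dependency, and `recognizeOnce` is an illustrative helper name. It also assumes that calling `end()` after a failed `init` is harmless.

```java
// Stand-in for the few TessBaseAPI methods used below (hypothetical interface,
// declared here only so the sketch compiles without the Android library).
interface OcrEngine {
    boolean init(String datapath, String language);
    String getUTF8Text();
    void end();
}

final class OcrLifecycle {
    /** Initializes the engine, reads the text, and always calls end(). */
    static String recognizeOnce(OcrEngine engine, String datapath, String language) {
        try {
            if (!engine.init(datapath, language)) {
                throw new IllegalStateException("init failed for language: " + language);
            }
            // A real caller would supply an image with setImage(...) before this point.
            return engine.getUTF8Text();
        } finally {
            // Dispose of the instance whether recognition succeeded or threw.
            engine.end();
        }
    }
}
```

With the real class, the same shape applies: construct the instance, do the work inside `try`, and call `end()` in `finally`.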
public TessBaseAPI(TessBaseAPI.ProgressNotifier progressNotifier)
When the instance of TessBaseAPI is no longer needed, its end()
method must be invoked to dispose of it.
progressNotifier - Callback to receive progress notifications
public boolean init(String datapath, String language)
Initializes the Tesseract engine with a specified language model. Returns true on success.
Instances are now mostly thread-safe and totally independent, but some global parameters remain. Basically it is safe to use multiple TessBaseAPIs in different threads in parallel, UNLESS you use SetVariable on some of the Params in classify and textord. If you do, then the effect will be to change it for all your instances.
The datapath must be the name of the parent directory of tessdata and
must end in /. Any name after the last / will be stripped. The language
is (usually) an ISO 639-3 string; null defaults to eng.
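One way to satisfy the datapath rule is to check for the tessdata directory and normalize the trailing slash before calling init. This is a sketch under assumptions: `DataPaths` and `toDatapath` are hypothetical names, not part of the library, and the helper assumes a platform where `/` is the path separator (as on Android).

```java
import java.io.File;

final class DataPaths {
    /**
     * Returns the absolute path of parentDir with a trailing '/',
     * or throws if parentDir does not contain a tessdata directory.
     */
    static String toDatapath(File parentDir) {
        File tessdata = new File(parentDir, "tessdata");
        if (!tessdata.isDirectory()) {
            throw new IllegalArgumentException("No tessdata directory under " + parentDir);
        }
        String path = parentDir.getAbsolutePath();
        return path.endsWith("/") ? path : path + "/";
    }
}
```

The returned string is what would be passed as the `datapath` argument.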
It is entirely safe (and eventually will be efficient too) to call Init
multiple times on the same instance to change language, or just to reset
the classifier.
The language may be a string of the form [~]<lang>[+[~]<lang>]*, indicating
that multiple languages are to be loaded. E.g. hin+eng will load Hindi and
English. Languages may specify internally that they want to be loaded
with one or more other languages, so the ~ sign is available to override
that. E.g. if hin were set to load eng by default, then hin+~eng would force
loading only hin. The number of loaded languages is limited only by
memory, with the caveat that loading additional languages will impact
both speed and accuracy, as there is more work to do to decide on the
applicable language, and there is more chance of hallucinating incorrect
words.
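To make the language string grammar concrete, here is a small sketch that splits a string of the form `[~]<lang>[+[~]<lang>]*` into its entries. `LanguageSpec` is a hypothetical helper for illustration only; it does not reproduce how Tesseract parses the string internally, and treating an empty entry as an error is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LanguageSpec {
    /**
     * Maps each language code to true (load it) or false (prefixed with ~,
     * meaning it should not be loaded even if another language requests it).
     */
    static Map<String, Boolean> parse(String spec) {
        if (spec == null) {
            spec = "eng"; // documented default for a null language
        }
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String part : spec.split("\\+", -1)) {
            boolean suppressed = part.startsWith("~");
            String code = suppressed ? part.substring(1) : part;
            if (code.isEmpty()) {
                throw new IllegalArgumentException("Empty language in: " + spec);
            }
            result.put(code, !suppressed);
        }
        return result;
    }
}
```

For example, `LanguageSpec.parse("hin+~eng")` yields a map in which `hin` is marked for loading and `eng` is marked as suppressed.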
WARNING: On changing languages, all Tesseract parameters are reset back to their default values. (Which may vary between languages.)
If you have a rare need to set a Variable that controls initialization for a second call to Init you should explicitly call End() and then use SetVariable before Init. This is only a very rare use case, since there are very few uses that require any parameters to be set before Init.
datapath - the parent directory of tessdata ending in a forward slash
language - an ISO 639-3 string representing the language(s)
Returns: true on success
public boolean init(String datapath, String language, int ocrEngineMode)
Initializes the Tesseract engine with the specified language model(s). Returns true on success.
datapath - the parent directory of tessdata ending in a forward slash
language - an ISO 639-3 string representing the language(s)
ocrEngineMode - the OCR engine mode to be set
Returns: true on success
See also: init(String, String)
public String getInitLanguagesAsString()
public void clear()
public void end()
Once End() has been used, none of the other API functions may be used other than Init and anything declared above it in the class definition.
public boolean setVariable(String var, String value)
Supply the name of the parameter and the value as a string, just as you would in a config file.
Returns false if the name lookup failed.
E.g. setVariable("tessedit_char_blacklist", "xyz"); to ignore x, y and z,
or setVariable("classify_bln_numeric_mode", "1"); to set numeric-only mode.
setVariable may be used before init, but settings will revert to defaults on end().
Note: Must be called after init(). Only works for non-init variables.
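Because setVariable reports a failed name lookup only through its boolean return value, it can help to apply several parameters at once and collect the ones that were rejected. The sketch below is illustrative: `VariableSink` is a hypothetical functional interface with the same shape as `setVariable(String, String)`, and `Variables.applyAll` is not part of the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in with the same shape as TessBaseAPI.setVariable.
interface VariableSink {
    boolean setVariable(String var, String value);
}

final class Variables {
    /** Applies every name/value pair and returns the names whose lookup failed. */
    static List<String> applyAll(VariableSink api, Map<String, String> vars) {
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, String> entry : vars.entrySet()) {
            if (!api.setVariable(entry.getKey(), entry.getValue())) {
                rejected.add(entry.getKey());
            }
        }
        return rejected;
    }
}
```

Since the signatures match, a method reference to a real instance's `setVariable` could be passed as the `VariableSink`.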
var - name of the variable
value - value to set
public int getPageSegMode()
public void setPageSegMode(int mode)
Sets the page segmentation mode. Defaults to
TessBaseAPI.PageSegMode.PSM_SINGLE_BLOCK. This controls how much processing
the OCR engine will perform before recognizing text.
The mode can also be modified by readConfigFile or setVariable("tessedit_pageseg_mode", mode as string).
mode - the TessBaseAPI.PageSegMode to set
public void setDebug(boolean enabled)
enabled - true to enable debugging mode
public void setRectangle(Rect rect)
rect - the bounding rectangle
public void setRectangle(int left, int top, int width, int height)
left - the left bound
top - the top bound
width - the width of the bounding box
height - the height of the bounding box
public void setImage(File file)
file - absolute path to the image file
public void setImage(Bitmap bmp)
bmp - bitmap representation of the image
public void setImage(Pix image)
image - Leptonica pix representation of the image
public void setImage(byte[] imagedata, int width, int height, int bpp, int bpl)
imagedata - byte representation of the image
width - image width
height - image height
bpp - bytes per pixel
bpl - bytes per line
public String getUTF8Text()
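The recognized text is returned as a String which is coded as UTF8. This is a
blocking operation that will not work with stop(). Call getHOCRText(int)
before calling this function to interrupt a recognition task with stop().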
public int meanConfidence()
public int[] wordConfidences()
The number of confidences should correspond to the number of space-delimited words in GetUTF8Text().
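The correspondence between words and confidences can be used to flag doubtful words. The following sketch assumes that splitting the recognized text on whitespace gives the same word boundaries as the confidence array; `WordConfidence` and `wordsBelow` are hypothetical names, and the helper rejects mismatched lengths rather than guessing.

```java
import java.util.ArrayList;
import java.util.List;

final class WordConfidence {
    /** Returns the words whose confidence (0 to 100) is below the threshold. */
    static List<String> wordsBelow(String utf8Text, int[] confidences, int threshold) {
        String trimmed = utf8Text.trim();
        String[] words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        if (words.length != confidences.length) {
            throw new IllegalArgumentException(
                "Expected " + words.length + " confidences, got " + confidences.length);
        }
        List<String> suspects = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            if (confidences[i] < threshold) {
                suspects.add(words[i]);
            }
        }
        return suspects;
    }
}
```

In practice the two arguments would come from `getUTF8Text()` and `wordConfidences()` on the same instance after recognition.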
public Pix getThresholdedImage()
Caller takes ownership of the Pix and must recycle() it. May be called any time after setImage.
public Pixa getRegions()
Can be called before or after Recognize.
public Pixa getTextlines()
Can be called before or after Recognize. Block IDs are not returned. Paragraph IDs are not returned.
public Pixa getStrips()
Enables downstream handling of non-rectangular regions. Can be called before or after Recognize. Block IDs are not returned.
public Pixa getWords()
Can be called before or after Recognize.
public Pixa getConnectedComponents()
Can be called before or after Recognize. Note: the caller is responsible for calling recycle() on the returned Pixa.
public ResultIterator getResultIterator()
public String getHOCRText(int page)
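Make an HTML-formatted string with hOCR markup from the internal data
structures. Recognition can be cancelled with stop().
page - is 0-based but will appear in the output as 1-based.
public void setInputName(String name)
name - input file name
public void setOutputName(String name)
name - output file name
public void readConfigFile(String filename)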
Searches the standard places: tessdata/configs, tessdata/tessconfigs. Note: only non-init params will be set.
filename - the configuration filename, without the path
public String getBoxText(int page)
Constructs coordinates in the original image - not just the rectangle.
page - a 0-based page index that will appear in the box file.
public String getVersion()
public void stop()
Cancel recognition started by getHOCRText(int).
protected void onProgressValues(int percent, int left, int right, int top, int bottom, int textLeft, int textRight, int textTop, int textBottom)
percent - Percent complete
left - Left bound of word bounding box
right - Right bound of word bounding box
top - Top bound of word bounding box
bottom - Bottom bound of word bounding box
textLeft - Left bound of text bounding box
textRight - Right bound of text bounding box
textTop - Top bound of text bounding box
textBottom - Bottom bound of text bounding box
public boolean beginDocument(TessPdfRenderer tessPdfRenderer, String title)
tessPdfRenderer - the renderer instance to use
title - a title to be used in the document metadata
Returns: true on success, false on failure
public boolean beginDocument(TessPdfRenderer tessPdfRenderer)
tessPdfRenderer - the renderer instance to use
Returns: true on success, false on failure
See also: beginDocument(TessPdfRenderer, String)
public boolean endDocument(TessPdfRenderer tessPdfRenderer)
tessPdfRenderer - the renderer instance to use
Returns: true on success, false on failure
public boolean addPageToDocument(Pix imageToProcess, String imageToWrite, TessPdfRenderer tessPdfRenderer)
imageToProcess - image to be used for OCR
imageToWrite - path to image to be written into resulting document
tessPdfRenderer - the renderer instance to use
Returns: true on success, false on failure