Man page of tw_learn(3) and tw_learn_file(3)
NAME
tw_learn, tw_learn_file - learn characteristics of a category
SYNOPSIS
C/C++
#include <tw.h>
tw_errno_t tw_learn(tw_t *tw, const char *cat, const char *str);
tw_errno_t tw_learn_file(tw_t *tw, const char *cat, const char *path);
DESCRIPTION
tw_learn() and tw_learn_file() analyse a document's content and learn how to assign similar documents to the given category and to its top-level categories.
tw_learn() processes strings, while tw_learn_file() handles documents stored in the file system.
PARAMETERS
Both tw_learn() and tw_learn_file() have the first two parameters in common:
- tw (tw_t *)
  Pointer to an initialized Textweiser object.
- cat (const char *)
  Name of the category the document is an example of.
tw_learn() expects as its third parameter:
- str (const char *)
  The document's content as a string.
tw_learn_file() expects as its third parameter:
- path (const char *)
  The path to the document within the file system.
RETURN VALUE
Both tw_learn() and tw_learn_file() return an error indicator (tw_errno_t).
A return value of TW_OK indicates success; any other value identifies the error that occurred.
The function tw_strerror(3) can be used to obtain a natural-language error message.
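The return-value convention above can be sketched as a small wrapper. This is an illustration under stated assumptions, not the library's documented usage: the helper learn_or_report() and the HAVE_TW_H macro are hypothetical, and the signature assumed for tw_strerror() (taking a tw_errno_t and returning a const char *) is a guess that should be checked against tw_strerror(3). The #else branch contains stand-in definitions only so that the sketch compiles without Textweiser installed.

```c
#include <stdio.h>

#ifdef HAVE_TW_H            /* hypothetical build flag: use the real library */
#include <tw.h>
#else
/* Stand-ins so the sketch compiles without Textweiser installed.
 * The real definitions come from <tw.h>; everything in this branch
 * is an assumption made for illustration only. */
typedef struct tw tw_t;
typedef int tw_errno_t;
#define TW_OK 0

static tw_errno_t tw_learn(tw_t *tw, const char *cat, const char *str)
{
    (void)tw;
    return (cat != NULL && str != NULL) ? TW_OK : 1;
}

static const char *tw_strerror(tw_errno_t err)
{
    return err == TW_OK ? "success" : "stand-in error";
}
#endif

/* Hypothetical helper: learn one string as an example of `cat` and
 * print a message if the call fails.
 * Returns 0 on success, -1 on any error. */
static int learn_or_report(tw_t *tw, const char *cat, const char *text)
{
    tw_errno_t err = tw_learn(tw, cat, text);

    if (err != TW_OK) {
        /* Any value other than TW_OK identifies an error. */
        fprintf(stderr, "tw_learn(%s): %s\n",
                cat != NULL ? cat : "(null)", tw_strerror(err));
        return -1;
    }
    return 0;
}
```

The same pattern would apply to tw_learn_file(), passing a path instead of the document's content.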
NOTES
- Both functions require the input to be plain text in a supported language; see the Textweiser User Manual for details.
- Both functions require the input to be encoded in UTF-8. If the document uses a different encoding, TW_ENOSUTF ("Not a supported Unicode Transformation Format") is returned as the error code.
- Before a document can be learned as an example of a category, the category has to be created using tw_add_category(3) or tw-admin(1).
- It is recommended to train each category with at least ten appropriate documents. When learning is complete, the database can be optimized using either tw-admin(1) or tw_optimize_db(3) to speed up classification tasks.
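Because both functions require UTF-8 input, a caller may want to check a buffer before handing it to tw_learn(). The following is a minimal sketch of such a pre-check in plain C. The function is_valid_utf8() is a hypothetical helper and not part of Textweiser, and it is not known whether its rules match the library's own check exactly. It accepts well-formed UTF-8 as defined by RFC 3629 and rejects overlong forms, surrogates, and code points above U+10FFFF.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Textweiser.
 * Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise. */
static int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;        /* continuation bytes expected after c */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point allowed for this length */

        if (c < 0x80)                { need = 0; cp = c;        min = 0; }
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;      /* stray continuation byte or invalid lead byte */

        if (need > len - i - 1)
            return 0;       /* sequence is cut off at the end of the buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;   /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)                       return 0;  /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)   return 0;  /* UTF-16 surrogate */
        if (cp > 0x10FFFF)                  return 0;  /* beyond Unicode range */

        i += need + 1;
    }
    return 1;
}
```

A caller could run this check on a string before passing it to tw_learn() and convert or skip documents that fail, rather than relying on the error code alone.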
SEE ALSO
tw-learn(1), tw-admin(1)
tw_add_category(3), tw_free(3), tw_strerror(3), tw_optimize_db(3)
Textweiser User Manual
http://www.lingua-systems.com/text-classifier/textweiser-library/
COPYRIGHT
Copyright (c) 2010-2011 Lingua-Systems Software GmbH


