Carrot2 Framework API Specification
public interface CaseNormalizer
Brings the case of all tokens in the titles and snippets of all input
tokenized documents to one common form. This process can be thought of as
'stemming for case'.
All input tokens must be instances of the StringTypedToken interface.
The input documents will be modified: their tokens are overwritten with
case-normalized versions. Token types are preserved.
No support is provided for the full text of documents. This class is not
thread-safe.
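As an illustration of what 'stemming for case' can mean, the following is a minimal sketch, not the Carrot2 implementation. It assumes one possible rule (rewrite every token to the case variant seen most often across all documents) and represents tokens as plain strings rather than StringTypedToken instances. The class name `CaseStemSketch` and the tie-breaking rule are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; the normalization rule is an assumption,
// not documented behavior of CaseNormalizer.
class CaseStemSketch {
    /** Overwrites each token, in place, with the most frequent case variant of that token. */
    static void normalize(List<List<String>> documents) {
        // Lowercased token -> (surface variant -> number of occurrences).
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (List<String> doc : documents) {
            for (String token : doc) {
                counts.computeIfAbsent(token.toLowerCase(Locale.ROOT), k -> new HashMap<>())
                      .merge(token, 1, Integer::sum);
            }
        }

        // Pick one winning variant per lowercased token.
        // Ties are broken lexicographically so the result is deterministic.
        Map<String, String> winner = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> e : counts.entrySet()) {
            String best = null;
            int bestCount = -1;
            for (Map.Entry<String, Integer> v : e.getValue().entrySet()) {
                boolean more = v.getValue() > bestCount;
                boolean tie = v.getValue() == bestCount && v.getKey().compareTo(best) < 0;
                if (more || tie) {
                    best = v.getKey();
                    bestCount = v.getValue();
                }
            }
            winner.put(e.getKey(), best);
        }

        // Overwrite the tokens of the input documents (the lists must be mutable).
        for (List<String> doc : documents) {
            for (int i = 0; i < doc.size(); i++) {
                doc.set(i, winner.get(doc.get(i).toLowerCase(Locale.ROOT)));
            }
        }
    }
}
```

With this rule, the variants "Data", "DATA" and "data" all collapse to whichever of them occurs most often, much as a stemmer collapses inflected forms to one stem.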
| Method Summary | |
|---|---|
| void | addDocument(TokenizedDocument document) Adds a document to the normalization engine. |
| void | clear() Clears this instance so that it can be reused with another set of documents. |
| List | getNormalizedDocuments() Returns a List of case-normalized documents. |
| Method Detail |
|---|
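void clear()
Clears this instance so that it can be reused with another set of documents.

void addDocument(TokenizedDocument document)
Adds a document to the normalization engine.
Throws:
IllegalStateException - when an attempt is made to add documents after getNormalizedDocuments() has been called.

List getNormalizedDocuments()
Returns a List of case-normalized documents. [...] clear() method. Note: it is in this method that the documents' tokens get modified.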
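The call order described above (add documents, fetch the normalized list, then clear before reuse) can be sketched as follows. This is a hypothetical stand-in, not the Carrot2 API: `CaseNormalizerSketch` and `LowercasingNormalizer` are invented names, documents are simplified to `List<String>` instead of TokenizedDocument, plain lowercasing stands in for the real normalization rule, and it is assumed that clear() re-enables addDocument().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in mirroring the three documented methods.
interface CaseNormalizerSketch {
    void clear();
    void addDocument(List<String> document);
    List<List<String>> getNormalizedDocuments();
}

// Placeholder implementation: lowercasing is an assumption used only
// to make the lifecycle observable.
class LowercasingNormalizer implements CaseNormalizerSketch {
    private final List<List<String>> documents = new ArrayList<>();
    private boolean normalized = false;

    @Override
    public void clear() {
        documents.clear();
        normalized = false; // assumed: the instance accepts documents again
    }

    @Override
    public void addDocument(List<String> document) {
        if (normalized) {
            throw new IllegalStateException(
                "getNormalizedDocuments() was already called; call clear() first");
        }
        documents.add(document);
    }

    @Override
    public List<List<String>> getNormalizedDocuments() {
        if (!normalized) {
            // The input documents are modified here, not in addDocument().
            for (List<String> doc : documents) {
                doc.replaceAll(token -> token.toLowerCase(Locale.ROOT));
            }
            normalized = true;
        }
        return new ArrayList<>(documents);
    }
}
```

Because such an instance holds mutable state between calls, and the interface is documented as not thread-safe, a caller would typically keep one instance per thread and call clear() between document sets.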
Please refer to project documentation at
http://project.carrot2.org