org.carrot2.filter.lingo.common
Class AbstractSnippetsIntWrapper

java.lang.Object
  extended by org.carrot2.filter.lingo.util.suffixarrays.wrapper.AbstractIntWrapper
      extended by org.carrot2.filter.lingo.common.AbstractSnippetsIntWrapper
All Implemented Interfaces:
IntWrapper
Direct Known Subclasses:
DefaultSnippetsIntWrapper, MultilingualSnippetsIntWrapper, MultilingualSnippetsIntWrapper.SplitSnippetsIntWrapper

public abstract class AbstractSnippetsIntWrapper
extends AbstractIntWrapper


Nested Class Summary
(package private) static class AbstractSnippetsIntWrapper.WordWrapper
           
 
Field Summary
protected  int distinctWordCount
          Distinct word count
protected static String DOCUMENT_DELIMITER
          Document delimiter string
protected  int documentCount
          Number of input documents
protected  int[] documentIndices
          Document indices corresponding to word indices are stored here to support wordIndex -> documentIndex mapping.
protected  String documentsData
          All documents concatenated
protected  int[] stopWordCodes
          Int codes of stop words
protected  int[] wordPositions
          Starting positions of all words comprising the input documents are stored here to facilitate wordIndexRange -> realString mapping.
 
Fields inherited from class org.carrot2.filter.lingo.util.suffixarrays.wrapper.AbstractIntWrapper
intData
 
Constructor Summary
AbstractSnippetsIntWrapper()
           
 
Method Summary
abstract  Object clone()
           
protected abstract  void createIntData()
           
 int getDistinctWordCount()
           
 int getDocumentCount()
           
 int[] getDocumentIndices()
           
 int[] getStopWordCodes()
           
 String getStringRepresentation(int from, int to)
           
 String getStringRepresentation(Substring substring)
          Returns the string representation of the given substring.
protected  void setDocuments(String[] documents)
          Sets the input documents.
 
Methods inherited from class org.carrot2.filter.lingo.util.suffixarrays.wrapper.AbstractIntWrapper
asIntArray, length, reverse, toString
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

DOCUMENT_DELIMITER

protected static final String DOCUMENT_DELIMITER
Document delimiter string

See Also:
Constant Field Values

documentsData

protected String documentsData
All documents concatenated


documentCount

protected int documentCount
Number of input documents


distinctWordCount

protected int distinctWordCount
Distinct word count


stopWordCodes

protected int[] stopWordCodes
Int codes of stop words


wordPositions

protected int[] wordPositions
Starting positions of all words comprising the input documents are stored here to facilitate wordIndexRange -> realString mapping.


documentIndices

protected int[] documentIndices
Document indices corresponding to word indices are stored here to support wordIndex -> documentIndex mapping.
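
The two mappings described for wordPositions and documentIndices can be illustrated with a small stand-alone sketch. It is not the Carrot2 implementation: the class WordIndexSketch, its methods stringFor and documentOf, whitespace tokenization, the omission of DOCUMENT_DELIMITER, and the exclusive upper bound of the word range are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not part of Carrot2.
class WordIndexSketch {
    final String documentsData;   // all documents concatenated
    final int[] wordPositions;    // word index -> start offset in documentsData
    final int[] documentIndices;  // word index -> index of the owning document

    WordIndexSketch(String[] documents) {
        StringBuilder data = new StringBuilder();
        List<Integer> positions = new ArrayList<>();
        List<Integer> owners = new ArrayList<>();
        for (int d = 0; d < documents.length; d++) {
            // Assumption: words are separated by whitespace.
            for (String word : documents[d].trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // empty document
                }
                positions.add(data.length()); // where this word starts
                owners.add(d);                // which document it came from
                data.append(word).append(' ');
            }
        }
        documentsData = data.toString();
        wordPositions = positions.stream().mapToInt(Integer::intValue).toArray();
        documentIndices = owners.stream().mapToInt(Integer::intValue).toArray();
    }

    // wordIndexRange -> realString; 'to' is exclusive in this sketch.
    String stringFor(int from, int to) {
        int start = wordPositions[from];
        int end = (to < wordPositions.length) ? wordPositions[to] : documentsData.length();
        return documentsData.substring(start, end).trim();
    }

    // wordIndex -> documentIndex
    int documentOf(int wordIndex) {
        return documentIndices[wordIndex];
    }

    public static void main(String[] args) {
        WordIndexSketch s = new WordIndexSketch(new String[] {"data mining tools", "web mining"});
        System.out.println(s.stringFor(1, 3)); // words 1 and 2 of the concatenated input
        System.out.println(s.documentOf(3));   // document that owns word 3
    }
}
```

Keeping both arrays parallel to the word sequence means a phrase found as a range of word indices can be turned back into text, and attributed to a document, with plain array lookups.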

Constructor Detail

AbstractSnippetsIntWrapper

public AbstractSnippetsIntWrapper()
Method Detail

setDocuments

protected void setDocuments(String[] documents)
Sets the input documents.

Parameters:
documents - the input documents

createIntData

protected abstract void createIntData()
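
This page does not say how subclasses implement createIntData(). As a rough sketch of what an int encoding of words might look like, the following assumes each distinct word receives a dense code in order of first occurrence, and that stop words are identified by their codes. The class IntCodingSketch and its constructor are hypothetical and not part of Carrot2; only the field names intData, distinctWordCount and stopWordCodes mirror this class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration only; not part of Carrot2.
class IntCodingSketch {
    final int[] intData;          // one int code per input word
    final int distinctWordCount;  // number of distinct words seen
    final int[] stopWordCodes;    // codes assigned to stop words

    IntCodingSketch(String[] words, Set<String> stopWords) {
        // Assumption: codes are assigned densely in order of first occurrence.
        Map<String, Integer> codes = new LinkedHashMap<>();
        intData = new int[words.length];
        for (int i = 0; i < words.length; i++) {
            Integer code = codes.get(words[i]);
            if (code == null) {
                code = codes.size();
                codes.put(words[i], code);
            }
            intData[i] = code;
        }
        distinctWordCount = codes.size();
        // Collect the codes of those words that are stop words.
        stopWordCodes = codes.entrySet().stream()
                .filter(e -> stopWords.contains(e.getKey()))
                .mapToInt(e -> e.getValue())
                .toArray();
    }
}
```

With an encoding like this, code that works on the int array can skip stop words by comparing codes rather than strings.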

clone

public abstract Object clone()
Overrides:
clone in class Object
See Also:
Object.clone()

getDocumentIndices

public int[] getDocumentIndices()

getStringRepresentation

public String getStringRepresentation(Substring substring)
Description copied from interface: IntWrapper
Returns the string representation of the given substring.

Returns:
the string representation of the substring

getStringRepresentation

public String getStringRepresentation(int from,
                                      int to)

getStopWordCodes

public int[] getStopWordCodes()

getDocumentCount

public int getDocumentCount()

getDistinctWordCount

public int getDistinctWordCount()


Copyright (c) Dawid Weiss, Stanislaw Osinski