| Interface | Description |
|---|---|
| Transformer | This interface defines the behaviour expected of Transformers. |

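
The interface contract can be pictured with a minimal sketch. The summary above does not give the method name or signature, so the single `transform(String)` method and the `UpperCaseTransformer` class below are assumptions made purely for illustration; the real interface may differ (for example, it may declare a checked exception).

```java
import java.util.Locale;

// Hypothetical shape of the interface: one method mapping a string to a string.
// The actual method name and signature are not stated in this summary.
interface Transformer {
    String transform(String s);
}

// Illustrative implementation (not part of the package): upper-cases its input.
class UpperCaseTransformer implements Transformer {
    @Override
    public String transform(String s) {
        // Pass null through unchanged rather than throwing.
        return s == null ? null : s.toUpperCase(Locale.ROOT);
    }
}

class TransformerSketch {
    public static void main(String[] args) {
        Transformer t = new UpperCaseTransformer();
        System.out.println(t.transform("Quercus robur"));
    }
}
```

Keeping the contract to a single string-to-string method is what would let implementations be chained freely, as the CompositeTransformer description suggests.
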
| Class | Description |
|---|---|
| A2BTransformer | A generic transformer that searches a string for all occurrences of a pattern (regex) `a` and replaces each with a string `b`. It takes two optional parameters, `removeMultipleWhitespaces` (default true) and `trimIt` (default true). |
| CapitalLettersExtractor | CapitalLettersExtractor returns only the capital letters in a string. |
| CompositeTransformer | A CompositeTransformer holds other transformers and executes them in order on the given string. |
| DictionaryRegexTransformer | Uses a Dictionary object, iterating over its keys and treating each as a regular expression; if the pattern matches, it transforms the string accordingly, returning the corresponding value of the Dictionary. |
| DictionaryTransformer | Uses a Dictionary object to look up a string in its keys and returns the value if the key is found. |
| EpithetTransformer | This transformer canonicalises epithets. |
| FakeHybridSignCleaner | Deletes X and x characters that seem to be meant as hybrid signs. They can be at the beginning of a string followed by whitespace, or anywhere in the string if surrounded by whitespace. |
| NormaliseDiacritsTransformer | Converts a string to its normalised form using Unicode's NFD form (http://unicode.org/reports/tr15/), adding an ASCII equivalent after any identified diacritical character, and then removes all non-ASCII characters from the string. In addition to the characters defined by the Unicode Consortium, the following characters are replaced: [[ADDITIONAL)_REPLACEMENTS]] |
| RegexDefCollection | A collection of regular expressions that are meaningful to the domain and/or are used multiple times. |
| RegexExtractorTransformer | A generic transformer that extracts all occurrences of a pattern (regex) in a string. It takes two optional parameters, `removeMultipleWhitespaces` (default true) and `trimIt` (default true). |
| RemoveBracketedTextTransformer | Removes all text in brackets (round, square and curly) incl. |
| RomanNumeralTransformer | This transformer splits a string into a series of words. |
| SafeStripNonAlphanumericsTransformer | A best-practice chain of three transformations: (1) replaces diacritic characters with their ASCII equivalent (NormaliseDiacriticsTransformer); (2) replaces all non-alphanumeric characters with `b` (default: whitespace); (3) replaces multiple whitespace occurrences with one whitespace. Returns a trimmed result. |
| SafeStripNonAlphasTransformer | A best-practice chain of three transformations: (1) replaces diacritic characters with their ASCII equivalent (NormaliseDiacriticsTransformer); (2) replaces all non-alphabetic characters with `b` (default: whitespace); (3) replaces multiple whitespace occurrences with one whitespace. Returns a trimmed result. |
| StringShrinker | Crops/shrinks a string to a length of `shrinkTo` characters. |
| StripNonAlphabeticCharactersTransformer | Deprecated. Only works on ASCII; use SafeStripNonAlphasTransformer instead. |
| StripNonAlphanumericCharactersTransformer | Deprecated. Only works on ASCII; use SafeStripNonAlphanumericsTransformer instead. |
| StripNonNumericCharactersTransformer | This transformer strips non-numeric characters. |
| YearRangeExtractorTransformer | Extracts year ranges from arbitrary strings. |
| ZeroToBlankTransformer | This transformer translates zeros to blanks. |

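
The three-step chain described for SafeStripNonAlphanumericsTransformer, together with the in-order execution described for CompositeTransformer, can be approximated with the JDK alone. The sketch below is not the library's implementation: the class and method names are hypothetical, and step 1 is simplified to NFD decomposition followed by dropping all non-ASCII characters, so characters without a canonical decomposition are removed rather than mapped to an ASCII equivalent (the library's additional replacements are not reproduced here).

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch; none of these names come from the package itself.
class SafeStripSketch {

    // Step 1 (simplified): decompose to NFD, then drop every non-ASCII
    // character, which removes the combining marks and keeps the base letters.
    static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("[^\\p{ASCII}]", "");
    }

    // Step 2: replace each non-alphanumeric ASCII character with b.
    static String replaceNonAlphanumerics(String s, String b) {
        return s.replaceAll("[^A-Za-z0-9]", b);
    }

    // Step 3: collapse runs of whitespace into one space, then trim.
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ").trim();
    }

    // Composite behaviour: apply each step in order to the running result.
    static String applyInOrder(String s, List<UnaryOperator<String>> steps) {
        String result = s;
        for (UnaryOperator<String> step : steps) {
            result = step.apply(result);
        }
        return result;
    }

    static String safeStrip(String s) {
        List<UnaryOperator<String>> steps = Arrays.asList(
                SafeStripSketch::stripDiacritics,
                x -> replaceNonAlphanumerics(x, " "),
                SafeStripSketch::collapseWhitespace);
        return applyInOrder(s, steps);
    }

    public static void main(String[] args) {
        // \u00C9 is "E with acute"; prints: Echium vulgare L 1753
        System.out.println(safeStrip("  \u00C9chium  vulgare, L. (1753) "));
    }
}
```

Running the diacritic step first matters: applied to the raw input, step 2 would replace the accented letter with whitespace and lose it, which is presumably why the deprecated ASCII-only strippers point to the Safe variants.
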
| Exception | Description |
|---|---|
| TransformationException | |

Copyright © 2012–2014 Royal Botanic Gardens, Kew. All rights reserved.