Interface | Description |
---|---|
Transformer |
This interface defines the behaviour expected of Transformers.
|
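
The page does not show the interface's method signature, so the following is only a minimal sketch of what such a contract might look like. The method name `transform`, its `String` parameter and return type, the nested `TransformationException` stand-in, and the `CapitalLettersOnly` class are all assumptions made for illustration and are not taken from the package itself.

```java
public class TransformerSketch {

    // Hypothetical checked exception, standing in for the package's TransformationException.
    static class TransformationException extends Exception {
        TransformationException(String message) {
            super(message);
        }
    }

    // Assumed contract: one string in, one transformed string out.
    interface Transformer {
        String transform(String s) throws TransformationException;
    }

    // Illustrative implementation that keeps only ASCII capital letters,
    // in the spirit of the CapitalLettersExtractor listed in this package.
    static class CapitalLettersOnly implements Transformer {
        @Override
        public String transform(String s) throws TransformationException {
            if (s == null) {
                throw new TransformationException("input must not be null");
            }
            // Remove every character that is not in the range A-Z.
            return s.replaceAll("[^A-Z]", "");
        }
    }

    public static void main(String[] args) throws TransformationException {
        Transformer t = new CapitalLettersOnly();
        System.out.println(t.transform("Royal Botanic Gardens, Kew")); // prints "RBGK"
    }
}
```

Keeping the contract to a single string-to-string method is what would let transformers be chained freely, as the CompositeTransformer entry in this package suggests.
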
Class | Description |
---|---|
A2BTransformer |
A generic transformer that searches for all occurrences of a pattern (regex)
`a` in a string and replaces each with a string `b`.
It takes two optional parameters, `removeMultipleWhitespaces` (default true) and
`trimIt` (default true).
|
CapitalLettersExtractor |
CapitalLettersExtractor returns only the capital letters in a string.
|
CompositeTransformer |
A CompositeTransformer holds other transformers and
executes them in order on the given string.
|
DictionaryRegexTransformer |
Iterates over the keys of a
Dictionary object, using each key as a regular expression;
if the pattern matches, it
transforms the string accordingly, returning the corresponding value of the
Dictionary. |
DictionaryTransformer |
Uses a
Dictionary object to look up a string in its
keys and returns the value if the key is found. |
EpithetTransformer |
This transformer canonicalises epithets.
|
FakeHybridSignCleaner |
Deletes `X` and `x` characters that seem to be meant as hybrid signs.
They can be at the beginning of a string followed by whitespace, or
anywhere in the string if surrounded by whitespace.
|
NormaliseDiacritsTransformer |
Converts a string to its normalised form using Unicode's NFD form
(http://unicode.org/reports/tr15/), adding an ASCII equivalent after any
identified diacritical character, and then removes
all non-ASCII characters from the string.
In addition to the characters defined by the Unicode Consortium, the following
characters are replaced: [[ADDITIONAL)_REPLACEMENTS]]
|
RegexDefCollection |
A collection of regular expressions that are meaningful to the domain and/or
are used multiple times.
|
RegexExtractorTransformer |
A generic transformer that extracts all occurrences of a pattern (regex)
in a string.
It takes two optional parameters, `removeMultipleWhitespaces` (default true) and
`trimIt` (default true).
|
RemoveBracketedTextTransformer |
Removes all text in brackets (round, square and curly) incl.
|
RomanNumeralTransformer |
This transformer splits a string into a series of words.
|
SafeStripNonAlphanumericsTransformer |
This is a best-practice chain of three transformations:
(1) replaces diacritic characters with their ASCII equivalent (NormaliseDiacritsTransformer)
(2) replaces all non-alphanumeric characters with `b` (default: whitespace)
(3) replaces multiple whitespace occurrences with one whitespace
Returns a trimmed result.
|
SafeStripNonAlphasTransformer |
This is a best-practice chain of three transformations:
(1) replaces diacritic characters with their ASCII equivalent (NormaliseDiacritsTransformer)
(2) replaces all non-alphabetic characters with `b` (default: whitespace)
(3) replaces multiple whitespace occurrences with one whitespace
Returns a trimmed result.
|
StringShrinker |
Crops/shrinks a string to a length of `shrinkTo` characters.
|
StripNonAlphabeticCharactersTransformer |
Deprecated. Only works on ASCII; use SafeStripNonAlphasTransformer instead.
|
StripNonAlphanumericCharactersTransformer |
Deprecated. Only works on ASCII; use SafeStripNonAlphanumericsTransformer instead.
|
StripNonNumericCharactersTransformer |
This transformer strips non-numeric characters.
|
YearRangeExtractorTransformer |
Extracts year ranges from arbitrary strings.
|
ZeroToBlankTransformer |
This transformer translates zeros to blanks.
|
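
The chain described for SafeStripNonAlphanumericsTransformer (normalise diacritics, replace non-alphanumerics, collapse whitespace, trim) can be sketched with the standard library alone. This is an illustrative approximation, not the package's implementation: the class `SafeStripSketch` and all its method names are hypothetical, and the diacritic step here simply decomposes with `java.text.Normalizer` and drops non-ASCII characters, so it does not reproduce any additional replacements the real transformer may apply.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;

public class SafeStripSketch {

    // Step 1: decompose to NFD, then drop every non-ASCII character.
    // This removes the combining marks that decomposition leaves behind.
    static String normaliseDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("[^\\p{ASCII}]", "");
    }

    // Step 2: replace every non-alphanumeric character with `replacement`.
    static String stripNonAlphanumerics(String s, String replacement) {
        return s.replaceAll("[^A-Za-z0-9]", Matcher.quoteReplacement(replacement));
    }

    // Step 3: collapse each run of whitespace into a single space.
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // Composite behaviour: apply each step in order to the running result.
    static String applyInOrder(String s, List<UnaryOperator<String>> steps) {
        String result = s;
        for (UnaryOperator<String> step : steps) {
            result = step.apply(result);
        }
        return result;
    }

    // The full chain, followed by a final trim.
    static String safeStrip(String s) {
        List<UnaryOperator<String>> steps = Arrays.asList(
                SafeStripSketch::normaliseDiacritics,
                x -> stripNonAlphanumerics(x, " "),
                SafeStripSketch::collapseWhitespace);
        return applyInOrder(s, steps).trim();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only: "  Café-Müller (1901)  "
        System.out.println(safeStrip("  Caf\u00e9-M\u00fcller (1901)  ")); // prints "Cafe Muller 1901"
    }
}
```

The order matters: stripping non-alphanumerics before normalising would discard accented letters entirely instead of reducing them to their base letters, which is presumably why the ASCII-only transformers are marked as deprecated in favour of the "safe" variants.
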
Exception | Description |
---|---|
TransformationException |
Copyright © 2012–2014 Royal Botanic Gardens, Kew. All rights reserved.