You can enter a string, word or clause, or search the text using regular expressions.
The engine does not do full-text search. It simply scans Streitberg's readings verse by verse and tries to match strings, across word boundaries. In other words, if you look for etun ‘they ate’, you'll also get fretun ‘they devoured’ and praufetuns ‘prophets’ – unless you select one of the word-based matching modes, or use regular expression syntax to anchor search patterns manually.
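The difference between matching across word boundaries and matching whole words can be tried out with any regular expression library. The following is a minimal sketch using Python's `re` module, not the engine's own code; the word list simply reuses the three forms mentioned above.

```python
import re

words = ["etun", "fretun", "praufetuns"]

# Unanchored: the pattern may match anywhere, including inside a longer word.
substring_hits = [w for w in words if re.search(r"etun", w)]

# Anchored with \b on both sides: only the whole word "etun" matches.
whole_word_hits = [w for w in words if re.search(r"\betun\b", w)]

print(substring_hits)
print(whole_word_hits)
```

The unanchored search reports all three words, while the anchored one reports only etun.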
You may enter the thorn (þ) and hwair (ƕ) characters directly, if you can, or substitute c for thorn and v for hwair. How hwair is shown in the results depends on the display configuration. You can also select interlinear translations if you want.
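How the engine performs this substitution internally is not documented here. As an illustration only, a query could be normalized along these lines before searching; the function name `normalize_query` is hypothetical, and the sketch assumes that only lowercase c and v are replaced.

```python
# Hypothetical helper: map the ASCII stand-ins to the Gothic letters.
# Assumption: only lowercase c and v are substituted.
SUBSTITUTES = str.maketrans({"c": "þ", "v": "ƕ"})

def normalize_query(query: str) -> str:
    """Replace c with thorn and v with hwair in a search string."""
    return query.translate(SUBSTITUTES)

print(normalize_query("cata"))
print(normalize_query("vas"))
```

A blanket replacement like this is only safe because c and v do not occur in the transcription alphabet of the text itself.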
This is a first version. There are some limitations.
Regular expressions provide a powerful and precise method for searching text. The syntax may seem complicated, but writing expressions is actually quite easy once you know the basics. (They tend to be harder to read.)
[aeuio] | matches one character out of a set of characters (in this case a vowel) |
[a-zþ] | idem, with a range of characters and one additional letter |
\w | matches any word character |
\W | matches any non-word character, i.e. punctuation, whitespace etc. |
\b | matches a word boundary |
. | matches any character |
a|bc | matches a OR bc |
* | repeats the preceding character or group 0, 1 or more times |
+ | repeats the preceding character or group 1 or more times |
? | repeats the preceding character or group 0 or 1 time, i.e. makes it optional |
{n,m} | repeats the preceding character or group at least n and at most m times |
You can create complex expressions by grouping basic expressions in parentheses, just like you would build arithmetic expressions using numbers, parentheses and the + and × operators:
(a|b)c | matches ac and bc |
(a|bc)(x|y) | matches ax, ay, bcx and bcy |
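To check what a grouped expression accepts, it can help to test it against a handful of candidate strings. A small sketch in Python (the candidate list is made up for illustration); `re.fullmatch` requires the whole string to match:

```python
import re

pattern = re.compile(r"(a|bc)(x|y)")
candidates = ["ax", "ay", "bcx", "bcy", "bx", "abcx"]

# Keep only the candidates that the pattern matches in full.
accepted = [s for s in candidates if pattern.fullmatch(s)]

print(accepted)
```

Only ax, ay, bcx and bcy are accepted: bx lacks the c that belongs to the second alternative, and abcx would need both alternatives at once.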
The operators *, ? and + are quantifiers. They indicate how many times the preceding character (or group) may be repeated:
suna?us? | matches sunus, sunu, sunau, sunaus |
(a|b)+ | matches a or b one or more times, i.e. any combination a, b, aa, bb, ab, abba, etc. ad infinitum. |
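The two quantifier examples can be verified the same way. This is a sketch in Python, with a few non-matching strings added for contrast:

```python
import re

# a and s are each optional, the u between them is required.
sunus = re.compile(r"suna?us?")
forms = ["sunus", "sunu", "sunau", "sunaus", "suns", "sunaaus"]
sunus_hits = [f for f in forms if sunus.fullmatch(f)]

# One or more characters, each of which is a or b.
ab = re.compile(r"(a|b)+")
ab_hits = [s for s in ["a", "abba", "", "abc"] if ab.fullmatch(s)]

print(sunus_hits)
print(ab_hits)
```

The form suns is rejected because the u before the optional s is not itself optional, and the empty string is rejected because + demands at least one repetition.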
Quantifiers are not wildcards. Be sure to understand the difference if you normally use * and ? as wildcards. The corresponding regular expressions are:
. | any single character (equivalent to wildcard ? in other systems) |
.* | zero, one or more occurrences of any character (equivalent to wildcard *) |
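If you are used to wildcards, the translation is mechanical. The helper below is a hypothetical illustration in Python (`wildcard_to_regex` is not part of the search engine); it rewrites ? as . and * as .* and escapes everything else.

```python
import re

def wildcard_to_regex(pattern: str) -> str:
    """Translate a wildcard pattern (? and *) into a regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")      # exactly one arbitrary character
        elif ch == "*":
            parts.append(".*")     # any number of arbitrary characters
        else:
            parts.append(re.escape(ch))  # literal character
    return "".join(parts)

translated = wildcard_to_regex("sun*")

# Read as a regular expression, "sun*" means "su" followed by zero or more n.
as_regex = re.fullmatch(r"sun*", "sunus")
as_wildcard = re.fullmatch(translated, "sunus")
```

Here `as_regex` is `None`, because the raw pattern sun* only accepts strings such as su, sun or sunn, whereas the translated pattern sun.* accepts sunus.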
In practice, you'll often want to use \w (any word character) rather than . to match unknown characters:
[A-Z]\w*us | Proper names ending in us. (Contrary to full-text systems, regular expressions are case sensitive by default!) |
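The effect of case sensitivity is easy to see on a short word list. The sketch below uses Python's `re` module and three sample words chosen for illustration; `re.IGNORECASE` is Python's flag, and whether the search engine offers a comparable option is not stated here.

```python
import re

words = ["Iesus", "Paitrus", "sunus"]
pattern = r"[A-Z]\w*us"

# Default behaviour: [A-Z] only matches an uppercase letter.
case_sensitive = [w for w in words if re.search(pattern, w)]

# With case folding switched on, the lowercase word matches as well.
case_insensitive = [w for w in words if re.search(pattern, w, re.IGNORECASE)]

print(case_sensitive)
print(case_insensitive)
```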
We also defined the following non-standard character placeholders:
\V | any Gothic vowel |
\C | any Gothic consonant |
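These placeholders are specific to this search engine; other regular expression libraries do not know them (Python, for example, rejects \C as a bad escape). One way to reproduce them elsewhere is to expand them into character classes first. The sketch below assumes the lowercase letter sets listed in the character table on this page; `expand_placeholders` is a hypothetical helper, not the engine's implementation.

```python
import re

GOTHIC_VOWELS = "aeuio"
GOTHIC_CONSONANTS = "bdfghjklmnpqrstwxzþƕ"

def expand_placeholders(pattern: str) -> str:
    """Replace \\V and \\C with explicit character classes.

    Naive on purpose: it does not handle placeholders that appear
    inside [...] or after an escaped backslash.
    """
    return (pattern
            .replace(r"\V", f"[{GOTHIC_VOWELS}]")
            .replace(r"\C", f"[{GOTHIC_CONSONANTS}]"))

expanded = expand_placeholders(r"\C{4,}")

# Sample string: find a run of four or more consonants.
cluster = re.search(expanded, "swumfsl")
print(expanded)
print(cluster.group())
```

In the sample string the run mfsl is found; the initial sw does not qualify because it is followed by a vowel after only two consonants.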
There are many more things you can do. Have a look at this comprehensive overview if you want to know the details. Below are a few examples to get you started.
Most of the examples operate in default matching mode and use \b or \w to simulate word-based searches. Alternatively, you can select a word-based matching mode and simply enter the word pattern you want to find in words, or at the start or end of words.
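The relationship between the two approaches can be pictured as wrapping the pattern in \b anchors. The following Python sketch is an assumption about how such modes could be emulated, not a description of the engine; the mode names and `wrap_for_mode` are invented for the example.

```python
import re

def wrap_for_mode(pattern: str, mode: str) -> str:
    """Emulate a word-based matching mode by adding \\b anchors."""
    grouped = f"(?:{pattern})"   # keep alternations such as sa|so|þata together
    if mode == "anywhere":
        return grouped
    if mode == "whole":
        return rf"\b{grouped}\b"
    if mode == "start":
        return rf"\b{grouped}"
    if mode == "end":
        return rf"{grouped}\b"
    raise ValueError(f"unknown mode: {mode}")

whole = wrap_for_mode("sa|so|þata", "whole")

# Without the group, \bsa|so|þata\b would anchor only the first and last
# alternative, and a bare "so" would match inside a longer sample word.
ungrouped_hit = re.search(r"\bsa|so|þata\b", "sokjan")
grouped_hit = re.search(whole, "sokjan")
```

The non-capturing group matters: `ungrouped_hit` finds so inside the longer word, while `grouped_hit` is `None`.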
att(an?|ins?)
liuha[þd](a|is)?
hlai[bf]\w*
\bhlai[bf]\w*
hlai[bf]
\betun\b
etun
aza\b
aza
\b(sa|so|þata)\b
sa|so|þata
\C{4,}
[bdfghjklmnpqrstwxzþƕ]{4,}
Kre[kt]\w*
Timeo Danaos et dona ferentes... (Particularly when they bring paradoxes, as they seem to do in Titus.)
Kre[kt]
[A-Z]\w+(a?us?|jus|uns|um|iwe)\b
\bin þ(amma|izai|aim) \w{3,}
·[\w·]+·
\[[^\]]+\]
<[^>]+>
\w+(~\w+)+
\b(\w+)\b( \b\1\b)+
\b(\C)\w+( \1\w+){3,}
\bmiþ[^þ\W]\w*
[^þ\W] (literally: any character that is not þ and not a non-word character) is a somewhat quirky way to say: any word character except þ.
miþ[^þ\W]
\b(at|af|ana|and|bi|du|ga|in|us)?(\C)ai\2\w+
(at|af|ana|and|bi|du|ga|in|us)?(\C)ai\2\w+
\w*[bdfhjklmnpqrstwxzþ]w[bdfghklmnpqrstwxzþ]\w*
[bdfhjklmnpqrstwxzþ]w[bdfghklmnpqrstwxzþ]
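Several of the patterns above rely on backreferences (\1, \2) or on a negated character class. The Python sketch below tries simplified versions of three of them on sample strings chosen for illustration. It spells out the consonant class because \C is specific to this engine, and it drops the optional prefix group from the reduplication pattern, so the backreference is \1 rather than \2.

```python
import re

CONSONANT = "[bdfghjklmnpqrstwxzþƕ]"   # stands in for the engine's \C

# A word immediately followed by one or more repetitions of itself.
repeated = re.search(r"\b(\w+)\b( \b\1\b)+", "amen amen qiþa izwis")

# A consonant, "ai", the same consonant again, then the rest of the word.
redup = re.compile(rf"\b({CONSONANT})ai\1\w+")
redup_hits = [w for w in ["lailot", "saislep", "haihait", "laisjan"]
              if redup.search(w)]

# "miþ" followed by a word character other than þ.
mith = re.compile(r"\bmiþ[^þ\W]\w*")
mith_hits = [w for w in ["miþ", "miþþanei", "miþgaggan"] if mith.search(w)]

print(repeated.group())
print(redup_hits)
print(mith_hits)
```

The fourth reduplication candidate fails because the consonant after ai differs from the first one, and of the miþ samples only the one where miþ is followed by a letter other than þ is kept.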
Regular expressions operate on strings over an alphabet. Knowing the alphabet is helpful when writing complex expressions. The text contains these Unicode characters:
aeuio | vowels (lower- and uppercase) |
bdfghjklmnpqrstwxzþƕ | consonants (lower- and uppercase) |
ï | Esaïan, gaïddja, ... |
û | þû is sa qimanda |
.,:;?! | punctuation |
“ ” ‘ ’ | quotation marks (in Skeireins) |
~ | assimilation: jan~ni (= jah ni) |
· | midpoint delimits numbers: du Kaurinþium ·b· ustauh |
— | em dash |
< > [ ] | additions and deletions by Streitberg |
_ | underscore marks a gap in a word, e.g. . . . . _teins þis balsanis warþ? |
As mentioned above, you may substitute c for þ and v for ƕ (hwair).
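The special characters in the table can be targeted directly in a pattern. The Python sketch below runs four of the example patterns from this page; the first two sample strings come from the table, the last two are invented placeholders that only illustrate the bracket notation.

```python
import re

# Numbers are enclosed in midpoints.
number = re.search(r"·[\w·]+·", "du Kaurinþium ·b· ustauh")

# Assimilated forms are joined with a tilde.
assimilation = re.search(r"\w+(~\w+)+", "jan~ni")

# Editorial brackets: everything from the opening to the closing bracket.
square = re.findall(r"\[[^\]]+\]", "aa [bb] cc")
angle = re.findall(r"<[^>]+>", "aa <bb> cc")

print(number.group(), assimilation.group(), square, angle)
```

The midpoint and the tilde are not word characters, which is why \w+ stops at them and why they have to be named explicitly in the pattern.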