Search the Gothic Bible

Search help

You can enter a string, word, or clause, or search the text using regular expressions.

The engine does not perform full-text search. It simply scans Streitberg’s readings verse by verse and tries to match strings, even across word boundaries. In other words, if you search for etun ‘they ate’, you’ll also get fretun ‘they devoured’ and praufetuns ‘prophets’ – unless you select one of the word-based matching modes, or use regular expression syntax to anchor your search patterns manually.
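The difference between the default substring matching and a word-anchored pattern can be sketched in JavaScript. The sample string is simply the three words mentioned above, not a verse from the corpus, and the \b used here is the browser's built-in ASCII word boundary rather than the engine's Gothic-aware version.

```javascript
// Illustrative sample only: the three words from the paragraph above.
const sample = "etun fretun praufetuns";

// Default behaviour: plain substring matching, even inside longer words.
const substringHits = sample.match(/etun/g);

// Anchored with \b on both sides: only the standalone word matches.
const wordHits = sample.match(/\betun\b/g);

console.log(substringHits.length); // 3
console.log(wordHits.length);      // 1
```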

You may enter the thorn and hwair characters directly, if you can, or substitute c and v, respectively. You can select interlinear translations, if you wish.

This is a first version. There are some limitations.

Pattern matching

Regular expressions provide a powerful and precise method for searching text. The syntax may seem complicated at first, but writing expressions is actually quite easy once you know the basics. (They tend to be harder to read.)

[aeuio]	matches one character out of a set of characters (in this case a vowel)
[a-zþ]	the same, with a range of characters and one additional letter
\w	matches any word character
\W	matches any non-word character, i.e. punctuation, white space etc.
\b	matches a word boundary
.	matches any character
a|bc	matches a OR bc
*	repeats the preceding character or group 0, 1 or more times
+	repeats the preceding character or group 1 or more times
?	repeats the preceding character or group 0 or 1 time, i.e. makes it optional
{n,m}	repeats the preceding character or group at least `n` and at most `m` times

You can create complex expressions by grouping basic expressions in parentheses, just as you would write arithmetic expressions with numbers, parentheses and the + and × operators:

(a|b)c	matches ac and bc
(a|bc)(x|y)	matches ax, ay, bcx and bcy
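The effect of parentheses on alternation can be checked in a browser console. In this small sketch, ^ and $ force each test string to match as a whole, and (?:...) is a group that is not captured; the input strings are arbitrary.

```javascript
// ^ and $ anchor each test so the whole string has to match.
const alternation = /^(?:a|bc)$/;   // a OR bc
const grouped = /^(a|b)c$/;         // (a OR b), then c
const twoGroups = /^(a|bc)(x|y)$/;  // (a OR bc), then (x OR y)

const inputs = ["a", "bc", "ac", "ax", "bcy", "abc"];
const matching = (regex) => inputs.filter((s) => regex.test(s));

console.log(matching(alternation)); // ["a", "bc"]
console.log(matching(grouped));     // ["bc", "ac"]
console.log(matching(twoGroups));   // ["ax", "bcy"]
```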

The operators *, ? and + are quantifiers. They indicate how many times the preceding character (or group) may be repeated:

suna?us?	matches sunus, sunu, sunau, sunaus
(a|b)+	matches a or b one or more times, i.e. any combination: a, b, aa, bb, ab, abba, etc., ad infinitum.
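The two quantifier examples can be verified with a short sketch. The anchors ^ and $ are added here so that each string is tested as a whole; suns and sunaaus are made-up non-matching strings included for contrast.

```javascript
// Each ? makes only the character directly before it optional.
const sunus = /^suna?us?$/;
const forms = ["sunus", "sunu", "sunau", "sunaus", "suns", "sunaaus"];
const accepted = forms.filter((form) => sunus.test(form));
console.log(accepted); // ["sunus", "sunu", "sunau", "sunaus"]

// + requires at least one repetition of the group.
const aOrB = /^(a|b)+$/;
console.log(aOrB.test("abba")); // true
console.log(aOrB.test(""));     // false
console.log(aOrB.test("abc"));  // false
```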

Quantifiers are not wildcards. Be sure to understand the difference if you are used to * and ? as wildcards. The corresponding regular expressions are:

.	any single character (equivalent to wildcard ? in other systems)
.*	zero, one or more occurrences of any character (equivalent to wildcard *)
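For readers coming from wildcard syntax, the mapping can be sketched as a small conversion function. The helper name wildcardToRegexSource is hypothetical and not part of the search engine; it only illustrates the correspondence between the two notations.

```javascript
// Hypothetical helper, not part of the search engine.
function wildcardToRegexSource(wildcard) {
  return wildcard
    // Escape every regex metacharacter except the two wildcards.
    .replace(/[.+^${}()|[\]\\]/g, "\\$&")
    // Wildcard * becomes "any character, zero or more times".
    .replace(/\*/g, ".*")
    // Wildcard ? becomes "exactly one character".
    .replace(/\?/g, ".");
}

console.log(wildcardToRegexSource("sun*"));  // "sun.*"
console.log(wildcardToRegexSource("sun?s")); // "sun.s"

// The converted pattern behaves like the wildcard did.
const converted = new RegExp("^" + wildcardToRegexSource("sun?s") + "$");
console.log(converted.test("sunus")); // true
console.log(converted.test("suns"));  // false
```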

In practice, you’ll often want to use \w (= any word character) rather than . to match unknown characters:

[A-Z]\w*us

Matches proper names ending in us. (Unlike full-text search systems, regular expressions are case-sensitive by default!)
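Case sensitivity is easy to see in a console. The sample string below is three words picked for illustration, and the i flag is standard JavaScript; whether the search form exposes such a flag is not something this sketch assumes.

```javascript
// Three sample words: one capitalised name and two lowercase words.
const sample = "Iesus jah sunus";

// Case-sensitive (the default): [A-Z] only accepts a capital letter.
const caseSensitive = sample.match(/[A-Z]\w*us/g);

// With the i flag, [A-Z] also accepts lowercase letters.
const caseInsensitive = sample.match(/[A-Z]\w*us/gi);

console.log(caseSensitive);   // ["Iesus"]
console.log(caseInsensitive); // ["Iesus", "sunus"]
```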

To allow precise matching, we redefined \w and its counterpart \b (word boundary) for Gothic. We also defined these non-standard character placeholders:

\V	any Gothic vowel
\C	any Gothic consonant
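One way such placeholders could be implemented is by rewriting the pattern before it reaches the browser's regex engine. The sketch below is an assumption about the approach, not the engine's actual code: the consonant class is the expansion shown for \C{4,} in the examples, the vowel class is a guess, and expandPlaceholders is a hypothetical name. It also shows why the built-in \w needs redefining for Gothic.

```javascript
// Assumed expansions. The consonant class is the one shown for \C{4,} in the
// examples; the vowel class is a guess and may differ from the real one.
const PLACEHOLDERS = {
  "\\V": "[aeiou]",
  "\\C": "[bdfghjklmnpqrstwxzþƕ]",
};

// Hypothetical preprocessing step: swap each placeholder for its class.
// Naive on purpose: it does not handle an escaped backslash before V or C.
function expandPlaceholders(pattern) {
  return pattern.replace(/\\[VC]/g, (token) => PLACEHOLDERS[token]);
}

const source = expandPlaceholders("\\C{4,}");
console.log(source); // "[bdfghjklmnpqrstwxzþƕ]{4,}"

// "astrþa" is a made-up string containing a four-consonant cluster.
console.log("astrþa".match(new RegExp(source, "g"))); // ["strþ"]

// The built-in \w is ASCII-only, so it skips þ at the start of a word.
console.log("þata".match(/\w+/)[0]); // "ata"
```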

There’s much more to regular expressions than shown here. Since the search runs client-side — directly in your browser — any expression supported by your browser’s JavaScript engine will work. Below are a few examples to get you started.

Examples

Most of the examples use the default matching mode and rely on \b or \w to simulate word-based searches. Alternatively, you can select a word-based matching mode from the drop-down menu and simply enter the pattern you want to find within words, or at the start or end of words.

att(an?|ins?): Matches atta, attan, attin and attins.
liuha[þd](a|is)?: Matches singular case forms of liuhaþ.
hlai[bf]\w*: Matches hlaifs, hlaif, hlaiba, hlaibis, ...
\bhlai[bf]\w*: The same, but with initial word boundary, excluding gahlaibaim. Equivalent to the mode “match at start of word” with the pattern hlai[bf].
\betun\b: Matches the word etun. Equivalent to the mode “match exact word” with the pattern etun.
aza\b: Matches words that end in aza, e.g. passive verb forms. Equivalent to the mode “match at end of word” with the pattern aza.
\b(sa|so|þata)\b: Demonstrative pronoun (or article) nominative singular. Equivalent to the mode “match exact word” with the pattern sa|so|þata.
\C{4,}: Matches clusters of 4 or more consonants. Equivalent to [bdfghjklmnpqrstwxzþƕ]{4,}.
Kre[kt]\w*: Timeo Danaos et dona ferentes... (Particularly when they bring paradoxes, as they seem to do in Titus.) Equivalent to the mode “match words” with the pattern Kre[kt].
[A-Z]\w+(a?us?|jus|uns|um|iwe)\b: Finds proper names with u-declension. (Remember that regular expressions are case-sensitive by default. Conveniently, Streitberg’s text only uses capitals for names and chapter initials. (There seem to be some exceptions though.))
\bin þ(amma|izai|aim) \w{3,}: Finds in + dative pronoun + any word that is at least 3 characters long.
·[\w·]+·: Finds numbers like ·ib·.
\[[^\]]+\]: Finds text deleted by Streitberg.
<[^>]+>: Finds text added by Streitberg.
\w+(~\w+)+: Finds words that show enclisis and/or assimilation (marked with ~).
\b(\w+)\b( \b\1\b)+: Finds repeated words. Literally, the expression reads ‘one or more word characters between word boundaries, captured as a group, followed by one or more occurrences of space followed by the first captured string between word boundaries’.
\b(\C)\w+( \1\w+){3,}: Alliteration? (Probably accidental...)
\b(at|af|ana|and|bi|du|ga|in|us)?(\C)ai\2\w+: Matches reduplicating verb forms with optional prefixes (resulting in some false positives, e.g. taitrarkes). Equivalent to the mode “match at start of word” with the pattern (at|af|ana|and|bi|du|ga|in|us)?(\C)ai\2\w+.
\w*[bdfhjklmnpqrstwxzþ]w[bdfghklmnpqrstwxzþ]\w*: Finds words that have ⟨w⟩ between consonants, i.e. when it represents Greek ⟨υ⟩, as in swnagogei, but excluding -ggw- in triggws and -wj- in manwjan, among others. Equivalent to the mode “match words” with the pattern [bdfhjklmnpqrstwxzþ]w[bdfghklmnpqrstwxzþ].
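A few of the patterns above can be tried out directly in a browser console. This sketch uses only JavaScript's built-in \w and \b, which are ASCII-only, so it sticks to patterns and sample strings where that makes no difference. The strings are short illustrations, not search results.

```javascript
// Repeated words: \1 refers back to whatever the first group captured.
const repeated = /\b(\w+)\b( \b\1\b)+/;
console.log(repeated.exec("amen amen qiþa izwis")[0]); // "amen amen"

// Editorial marks: [...] for deletions, <...> for additions.
// The marked-up string below is invented for this example.
const marked = "waurd [jah] <was>";
console.log(marked.match(/\[[^\]]+\]/g)); // ["[jah]"]
console.log(marked.match(/<[^>]+>/g));    // ["<was>"]

// Numbers written between midpoints.
console.log(/·[\w·]+·/.exec("·ib·")[0]); // "·ib·"
```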

Characters

Regular expressions operate on strings over an alphabet. Knowing the alphabet is helpful when writing complex expressions. The text contains these Unicode characters:

aeuio	vowels (lower- and uppercase)
bdfghjklmnpqrstwxzþƕ	consonants (lower- and uppercase)
ï	Esaïan, gaïddja, ... (more)
û	þû is sa qimanda
.,:;?!	punctuation
“ ” ‘ ’	quotation marks (in Skeireins)
~	assimilation: jan~ni (= jah ni)
·	midpoint delimits numbers: du Kaurinþium ·b· ustauh
—	em dash
< > [ ]	additions and deletions by Streitberg
_	underscore marks a gap in a word, e.g. . . . . _teins þis balsanis warþ?
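Because the editorial marks < > [ ] can fall inside a word, a plain pattern may fail to match a word that contains them. The sketch below illustrates the effect with an invented string; stripEditorialMarks is a hypothetical helper showing one possible way to search an unmarked view of the text, not something the engine currently does.

```javascript
// Hypothetical helper: drop the mark characters but keep the enclosed text.
function stripEditorialMarks(text) {
  return text.replace(/[<>\[\]]/g, "");
}

// Invented example of a word with an editorial addition inside it.
const markedWord = "at<t>a";

console.log(/atta/.test(markedWord));                      // false
console.log(/atta/.test(stripEditorialMarks(markedWord))); // true

// Alternative without preprocessing: allow optional marks in the pattern.
console.log(/at<?t>?a/.test(markedWord)); // true
```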

As mentioned above, you may substitute c for þ and v for ƕ (hwair).
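Since neither c nor v appears in the alphabet listed above, the substitution could be as simple as the following sketch. The function name is hypothetical, and this naive version would also rewrite a c or v inside a regular-expression escape such as \v, which a real implementation would need to handle.

```javascript
// Hypothetical helper: map the ASCII stand-ins onto the Gothic letters.
function substituteLetters(query) {
  return query.replace(/c/g, "þ").replace(/v/g, "ƕ");
}

console.log(substituteLetters("cata")); // "þata"
console.log(substituteLetters("vas"));  // "ƕas"
```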

Known issues and limitations

The search engine scans only the Gothic text. You won’t get results for John 1:1, Wulfila, dative, Christ, Streitberg, Codex Argenteus, etc. (Use Google or another search engine for those.)
It searches the text as edited by Streitberg, including editorial marks for additions <...> and deletions [...]. Since these may occur within words, you may occasionally miss results you’d expect. (A more sophisticated engine is planned that will search across different views of the text.)
Searching for headwords or filtering by grammatical tags is not yet supported. (However, thanks to Gothic’s rich inflectional morphology, you can get quite far by filtering on endings or other morphological features, as shown in the examples.)
It is not (yet) possible to use the Gothic alphabet in your queries, and transliteration settings are currently ignored. (You can, however, select translations.)
The search engine runs in your browser using JavaScript. It should work in any common post-2020 browser; let us know if it doesn’t. There may be a slight delay when opening the page for the first time, while the search data is downloaded to your device.
Since the engine runs on your own system, don’t bother with pathological or maliciously crafted expressions to bring down the server. You’ll simply crash your browser tab.
To avoid precisely that, there is a limit on the number of matches. If your search expression yields thousands of results, they will be truncated. Given the small size of the corpus, we prefer to return all results at once, without paging. (This is convenient if you want to print or copy results.) But there are better ways to download the full text than by entering a query that matches every line, word, or letter in the Gothic Bible.
Finally: this is a beta version. Unexpected glitches may occur. Please let us know if something goes wrong, if something could be improved, or if you think of interesting example queries.
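The verse-by-verse scan with a match limit described in the list above might work along these lines. The function name, the shape of the result and the limit values are all assumptions made for this sketch; the actual cap used by the engine is not stated here, and the sample list holds single words rather than real verses.

```javascript
// Hypothetical sketch of a capped scan; not the engine's actual code.
function searchWithLimit(verses, pattern, limit) {
  const regex = new RegExp(pattern); // no "g" flag, so test() keeps no state
  const hits = [];
  let truncated = false;
  for (const verse of verses) {
    if (!regex.test(verse)) continue;
    if (hits.length === limit) {
      // One more match exists than we are willing to return.
      truncated = true;
      break;
    }
    hits.push(verse);
  }
  return { hits, truncated };
}

const lines = ["etun", "fretun", "sunus", "praufetuns"];
console.log(searchWithLimit(lines, "etun", 2)); // 2 hits, truncated: true
console.log(searchWithLimit(lines, "etun", 5)); // 3 hits, truncated: false
```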