This is an analyzer for Akkadian nouns, written in Python, that uses regular expressions and very light constraint parsing for state disambiguation. Case, number, gender and state are identified through regex pattern matching, while ambiguities are resolved through very simple unification. Because the construct and absolute states are both unmarked, an unmarked noun is disambiguated by checking the nouns that precede and follow it (if either exists). If it is followed by a noun in the genitive, it is possessed by that noun and is therefore in the construct state; if it is preceded by a noun in the nominative, it is a predicate and is therefore in the absolute state.
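The disambiguation rule described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the analyzer's actual code: the names `CASE_ENDINGS`, `analyze` and `disambiguate` are hypothetical, the ending table covers only singular nouns with mimation (-um, -am, -im), gender and number are left out, macrons are omitted from the transliteration, and the "ambiguous" label for nouns with no useful neighbour is my own addition.

```python
import re

# Hypothetical, simplified ending table: singular nouns with mimation only.
CASE_ENDINGS = re.compile(r"^(?P<stem>.+?)(?P<ending>um|am|im)$")
CASES = {"um": "nominative", "am": "accusative", "im": "genitive"}


def analyze(token):
    """Assign case and state to one token from its suffix alone."""
    match = CASE_ENDINGS.match(token)
    if match is None:
        # No case ending: construct or absolute, to be decided from context.
        return {"form": token, "case": None, "state": None}
    return {
        "form": token,
        "case": CASES[match.group("ending")],
        "state": "governed",
    }


def disambiguate(tokens):
    """Resolve unmarked nouns by looking at their immediate neighbours."""
    analyses = [analyze(token) for token in tokens]
    for i, current in enumerate(analyses):
        if current["state"] is not None:
            continue  # already resolved by the regex pass
        preceding = analyses[i - 1] if i > 0 else None
        following = analyses[i + 1] if i + 1 < len(analyses) else None
        if following is not None and following["case"] == "genitive":
            current["state"] = "construct"  # possessed by the next noun
        elif preceding is not None and preceding["case"] == "nominative":
            current["state"] = "absolute"  # predicate of the previous noun
        else:
            current["state"] = "ambiguous"  # assumption: no rule applies
    return analyses


if __name__ == "__main__":
    # "bel bitim" (lord of the house): unmarked noun followed by a genitive.
    for analysis in disambiguate(["bel", "bitim"]):
        print(analysis)
```

The regex pass and the context pass are kept separate so that the neighbour checks always see every token's case, whatever order the tokens are resolved in.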
Akkadian is an ancient East Semitic language that was spoken in Mesopotamia from about the 3rd millennium BC to the 5th century BC. It was the language of the Akkadian Empire, often considered the first known imperial regime, and of the subsequent Mesopotamian empires (the Old and Middle Assyrian empires, Babylonia, and the early Neo-Assyrian empire). When Aramaic was adopted as a language of government around the 8th century BC, the use of Akkadian slowly dwindled. It is the language of the Code of Hammurabi, one of the earliest written legal codes, as well as of the Epic of Gilgamesh, the oldest surviving great work of literature.
As a Semitic language, it shares many features with its relatives: non-concatenative, root-and-stem-based morphology; three cases (which Arabic conserved, unlike Aramaic and Hebrew); a system of three states (governed, absolute and construct, though the absolute was less used); and two genders (masculine and feminine), with the feminine marked by a -t suffix (just as in Arabic and, to a lesser extent, Hebrew). I chose it precisely because it retained many of the interesting features of Semitic languages while being comparatively easy to work with. Classical/Modern Standard Arabic and Biblical Hebrew grammar would require a lot more work, and I wanted a proof of concept for implementing more robust analyzers/parsers for other Semitic languages, as well as a way to explore the potential limitations and advantages of regular languages/regexes in morphology and tokenization.
For more information about its development, read my code report here. Any other questions may be directed to my email, codexderelict@proton.me. I welcome any messages, whether or not they relate to this project or any of my other ones.