Friday, November 9, 2007

Database technology and the Bible

John Hobbins has outlined his requirements for a research database for the Study of Ancient Hebrew Poetry (see here):

The overview of recent research provided in the preceding posts illustrates the degree to which parallel structures characterize ancient Hebrew verse at every level of the textual hierarchy. A research database designed to facilitate the study of the phenomenon would ideally have the following features. Components of texts presumed to be poetry would be tagged at the macro-structural, prosodic, semantic, syntactic, morphological, and sonic levels. Each of the six levels, to be sure, is multidimensional.
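To make the requirement concrete for myself, here is a minimal sketch of what tagging at several levels might look like as data. It assumes a stand-off scheme in which each tag points at a span of tokens; the six level names come from John's list, but the class names (`Level`, `Tag`, `TaggedText`), the span convention and the labels are my own hypothetical choices, not anything John has specified.

```python
from dataclasses import dataclass, field
from enum import Enum


class Level(Enum):
    """The six levels named in the requirement."""
    MACRO_STRUCTURAL = "macro-structural"
    PROSODIC = "prosodic"
    SEMANTIC = "semantic"
    SYNTACTIC = "syntactic"
    MORPHOLOGICAL = "morphological"
    SONIC = "sonic"


@dataclass(frozen=True)
class Tag:
    """One annotation over a span of tokens (hypothetical structure)."""
    level: Level
    start: int   # index of the first token in the span (inclusive)
    end: int     # index just past the last token (exclusive)
    label: str   # free-text label; a real design would need a controlled vocabulary


@dataclass
class TaggedText:
    """A token sequence plus any number of tags at any level."""
    tokens: list[str]
    tags: list[Tag] = field(default_factory=list)

    def add(self, level: Level, start: int, end: int, label: str) -> None:
        # Reject spans that fall outside the token sequence.
        if not (0 <= start < end <= len(self.tokens)):
            raise ValueError("span out of range")
        self.tags.append(Tag(level, start, end, label))

    def at_level(self, level: Level) -> list[Tag]:
        # All tags recorded at one level, in insertion order.
        return [t for t in self.tags if t.level is level]


# Placeholder tokens stand in for the words of a line of verse.
line = TaggedText(["w1", "w2", "w3", "w4"])
line.add(Level.PROSODIC, 0, 2, "first half-line")
line.add(Level.PROSODIC, 2, 4, "second half-line")
line.add(Level.SYNTACTIC, 0, 4, "clause")
```

Because the tags sit beside the text rather than inside it, spans at different levels may overlap freely, which seems necessary if each level is itself multidimensional.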

My own specialty over the past 40 years has been data analysis and database structures. There is a vast gulf between natural language and a database. I have not yet had a chance to study a tagged text in any depth, such as the morphologically parsed and tagged text from Westminster Theological Seminary.

Here is another statement of requirements, this one undated:

At present there does not exist a freely available syntactic database and corresponding search engine for the Biblical Hebrew and Aramaic syntactician and discourse analyst. Further, there is no standard, universally available database whereby the scientific community can repeat and verify the results of such study. Such a database would permit the researcher to make comprehensive statements about the behaviour of Biblical Hebrew syntax and textgrammar. Since it is first and foremost a research tool, a fundamental requirement is that the data and analysis are completely accessible and configurable by the researcher to reflect varying theories and improved understanding of the text and theories used to investigate the text.

I am sure the techniques are legion and mutually incompatible. My approach would be to discover the objects these requirements imply and the relationships among them. Has anyone done this type of analysis yet? Some years ago I did a bit of analysis for a text-based database linking verse with scholar. It would not serve as a database for what I am doing now, or for what John wants, though some of its entities might be extended in that direction (see this entity-relationship diagram).
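As an illustration of what I mean by objects and relationships, here is a minimal sketch of a verse-to-scholar model using Python's built-in `sqlite3` module. The table and column names (`verse`, `scholar`, `comment`) are assumptions made up for this example; they are not copied from the diagram, and the rows are placeholders rather than real data.

```python
import sqlite3

# Hypothetical schema: three entities, with 'comment' carrying the
# many-to-many relationship between verses and scholars.
SCHEMA = """
CREATE TABLE verse (
    verse_id INTEGER PRIMARY KEY,
    book     TEXT    NOT NULL,
    chapter  INTEGER NOT NULL,
    verse    INTEGER NOT NULL,
    UNIQUE (book, chapter, verse)
);
CREATE TABLE scholar (
    scholar_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE comment (
    comment_id INTEGER PRIMARY KEY,
    verse_id   INTEGER NOT NULL REFERENCES verse(verse_id),
    scholar_id INTEGER NOT NULL REFERENCES scholar(scholar_id),
    note       TEXT NOT NULL
);
"""


def build_demo():
    """Create an in-memory database holding one placeholder row per table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO verse (book, chapter, verse) VALUES ('Psalms', 1, 1)")
    conn.execute("INSERT INTO scholar (name) VALUES ('Scholar A')")
    conn.execute(
        "INSERT INTO comment (verse_id, scholar_id, note) "
        "VALUES (1, 1, 'placeholder note')"
    )
    return conn


def scholars_for(conn, book, chapter, verse):
    """Return the names of scholars who have commented on one verse."""
    rows = conn.execute(
        """
        SELECT DISTINCT s.name
        FROM scholar AS s
        JOIN comment AS c ON c.scholar_id = s.scholar_id
        JOIN verse   AS v ON v.verse_id   = c.verse_id
        WHERE v.book = ? AND v.chapter = ? AND v.verse = ?
        ORDER BY s.name
        """,
        (book, chapter, verse),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo()
print(scholars_for(conn, "Psalms", 1, 1))
```

Even this toy version shows the limitation: the verse is the smallest addressable unit, so it has no place for a word, a half-line or a syntactic tag. Those would have to be discovered as further entities.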

There are two related problems in building such a database. The first is design: finding the 'right' and 'extensible' set of objects. We know there are such objects because people use books and verses to hit each other with, but this is not necessarily the right starting point. The second problem is loading the data from a verifiable source. Both are almost intractable given the explosion in thinking about Bible texts that is evident in the noosphere today. But if we do find the right structure, people will understand what Fred Brooks wrote in The Mythical Man-Month (1975), which I paraphrase: "show me your logic and I will be mystified, show me your data and I won't need your logic to understand you."

