For those who don't have time to work through the details or need help keeping the big picture in mind, I will briefly summarize and simplify the steps that currently go into this process.
1. Transcribe all extant Greek witnesses into an electronic format (XML). When I worked for the IGNTP project on John, two transcribers independently transcribed the text of each manuscript by changing a base text (the Textus Receptus) to match the manuscript being transcribed. They included information about text, layout, lacunae, and corrections in a Word document according to a set pattern. The two transcriptions were then compared automatically, and a senior scholar personally reconciled any conflicts, yielding extremely accurate final transcriptions. A technical officer then ran a script to convert the transcriptions into XML format. I think they are now starting to use an online transcription editor that creates the XML directly. In my opinion, NT scholars do a much better job than OT scholars at exhausting the direct manuscript tradition and storing the raw data in electronic format, though they probably have much to learn from OT scholars on the use of the versions, which unfortunately feature minimally in the CBGM. (A toy sketch of the transcription-comparison step appears after the list below.)
2. Automatically collate the electronic texts and then, as a human editor, correct the automatic collations to isolate appropriate variation units (strings of text where variation occurs) and their variant readings. In this process, the editor "regularizes" the spelling to eliminate minor orthographic differences and to identify meaningful agreement and disagreement in the collated variant readings. (See the regularization sketch after the list below.)
3. Automatically calculate the statistical proximity of each text to every other text based on the percentage agreement in variation units. They call this "pre-genealogical coherence," because it reflects the absolute statistical level of agreement between texts without respect to the nature of the agreements or actual genealogical relationships. In other words, variants are counted first, and only later weighed and recounted. (See the percentage-agreement sketch after the list below.)
4. Attempt to relate the variant readings in each variation unit to each other by a local stemma (i.e., a stemma of variant readings) that explains the direction of development of particular readings from other readings. At this point, textual critics use internal and external evidence (including the ancient versions) in basically the same way textual criticism has traditionally been practiced. Pre-genealogical coherence serves as an important criterion for evaluating external evidence at this stage.
5. By evaluating and recording each variation unit on its own (though at the initial stage you may, if you want, include only those variation units that can be confidently adjudicated), you create a database of all your decisions about the initial text and the local stemmata of readings. (The local-stemma sketch after the list below shows one way to model such a database.)
6. Since this database records both your proposals for the initial text and the directions of development of readings from one another, the computer can work out the ramifications of your decisions for the textual flow of the tradition, called "genealogical coherence." It can also flag decisions that create incoherent pictures of textual transmission, which can then be reconsidered in closer detail. For instance, if you said that one reading developed from another, but that decision implies textual relationships that do not fit with your other decisions, you can reconsider the decision in light of a clearer picture of the relationships between textual states. It is important to note that the computer does not make the textual decisions for you; it only keeps track of all your previous decisions simultaneously and points out inconsistencies. (See the textual-flow sketch after the list below.)
7. Once problematic variation units are flagged, you can reexamine them in light of the picture of the general textual flow resulting from your decisions. If you did not come to a conclusion about a particular issue in your initial evaluation, you can now reexamine it with this additional information as well. In other words, you can use the overall consistency or "coherence" of the textual relationships resulting from your text-critical decisions as a criterion for reevaluating and refining your prior conclusions. This is where the complicated parts of the CBGM really come in, and I haven't used it enough to give a thorough treatment, but I'll try to highlight a few important points.
- Texts are abstracted from the material contexts in which they are found (a major point of controversy), so a historically later "text" can theoretically be hypothesized as a source for a historically earlier "text." The implication of this is that the CBGM does not produce a stemma of manuscripts intended to reflect the historical relationships between manuscripts, but rather the general flow of the text between known states of the text. In this regard, the CBGM is better suited to reconstructing the initial text and isolating important texts than to tracing the history of the text in real time.
- Search for an optimal number of source texts to explain each known state of the text. On the assumption of a generally conservative transmissional tendency, try to identify which texts would be required to provide sufficient source material to explain the origins of its readings without needlessly multiplying sources. Texts can have multiple ancestors, but the number of ancestors hypothesized should be kept to the minimum required to explain the evidence sufficiently. Potential ancestors that explain many readings in the text are to be preferred to those with only occasional helpful source material. Potential ancestors should be sought in documented states of the text, not reconstructed hyparchetypes. Again, this process does not claim that the manuscript itself was copied from these sources, only that its text somehow inherited text from the textual states now documented in other manuscripts. (See the greedy substemma sketch after this list.)
- Local stemmata that fit coherently within the general textual flow are typically to be preferred to those that do not.
- Readings emerging in genealogically distant texts may be explained as contamination/mixture or as having been created independently multiple times in the tradition. In other words, the overall coherence of the tradition can be used to clarify just how "indicative" a variant reading is of genealogical relationships. Some shared secondary readings may simply be coincidental or the result of occasional mixture. (See the attestation-coherence sketch after this list.)
- On the supposition that the impact of contamination/mixture between different texts was typically less extensive than that of normal copying processes, the coherence of the general textual flow can be used to minimize or bypass the occasional effects of contamination and accurately trace the directions of the flow of the text, even in an open (i.e., contaminated) system.
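Since several of the steps above are easier to grasp in code than in prose, here are a few toy sketches in Python. None of them is the actual CBGM, IGNTP, or INTF software, and every witness siglum, reading, and number in them is invented for illustration. First, the transcription-comparison step from step 1: two independent transcriptions are compared automatically, and only the disagreements go to a human reconciler. A minimal sketch, assuming the transcriptions are simple word lists rather than structured documents:

```python
# Align two independent transcriptions of the same passage and flag
# disagreements for a human reconciler. Toy sketch only; the real workflow
# compares structured transcriptions with layout and correction data.
import difflib

transcriber_a = "εν αρχη ην ο λογος και ο λογος ην προς τον θεον".split()
transcriber_b = "εν αρχη ην ο λογος και ο λογος ην προς τον θν".split()

matcher = difflib.SequenceMatcher(a=transcriber_a, b=transcriber_b)
for tag, i1, i2, j1, j2 in matcher.get_opcodes():
    if tag != "equal":  # identical stretches need no human attention
        print(f"conflict: A reads {transcriber_a[i1:i2]}, B reads {transcriber_b[j1:j2]}")
```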
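For step 2, regularization can be pictured as mapping variant spellings to a normalized form before collation, so that only meaningful differences survive into the variation units. Both the mapping and the token-for-token alignment below are hypothetical simplifications; real collation software works much harder to align the texts:

```python
# Collapse purely orthographic variation, then record the positions where
# the aligned witnesses still disagree; those positions are variation units.
REGULARIZE = {
    "θν": "θεον",     # nomen sacrum resolved to its full form
    "ειπε": "ειπεν",  # movable nu normalized
}

def regularize(tokens):
    return [REGULARIZE.get(t, t) for t in tokens]

def variation_units(witnesses):
    units = {}
    for i, readings in enumerate(zip(*witnesses)):
        if len(set(readings)) > 1:  # a genuine disagreement survives
            units[i] = readings
    return units

w1 = regularize("ην προς τον θν".split())
w2 = regularize("ην προς τον θεον".split())
w3 = regularize("ην εις τον θεον".split())
print(variation_units([w1, w2, w3]))  # {1: ('προς', 'προς', 'εις')}
```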
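Step 3, pre-genealogical coherence, is just percentage agreement across the variation units where both witnesses are extant. The sigla below belong to real manuscripts (P75, Vaticanus, Alexandrinus), but the readings are invented:

```python
# Percentage agreement between two witnesses, counting only variation
# units where both are extant (None marks a lacuna).
def agreement(w1, w2):
    shared = [(a, b) for a, b in zip(w1, w2) if a is not None and b is not None]
    if not shared:
        return None  # no overlapping text to compare
    return 100 * sum(a == b for a, b in shared) / len(shared)

P75 = ["a", "b", "a", None, "c"]  # invented readings at five units
B03 = ["a", "b", "a", "a",  "c"]
A02 = ["b", "b", "a", "b",  "c"]

print(agreement(P75, B03))  # 100.0: B03 is P75's closest relative here
print(agreement(P75, A02))  # 75.0
```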
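Steps 4 and 5 amount to building a small directed graph of readings for each variation unit and storing all of those graphs, together with your choices for the initial text, in one database. The variation-unit address and readings here are hypothetical:

```python
# A decision database: one local stemma per variation unit. An edge
# ("a", "b") records the decision that reading b developed from reading a.
local_stemmata = {
    "John 1:18/20": {  # hypothetical variation-unit address
        "initial": "a",
        "edges": {("a", "b"), ("a", "c"), ("b", "d")},
    },
}

def prior(unit, x, y):
    """True if reading x is (directly or indirectly) the source of reading y."""
    edges = local_stemmata[unit]["edges"]
    stack, seen = [x], {x}
    while stack:  # depth-first walk from x
        node = stack.pop()
        for a, b in edges:
            if a == node and b not in seen:
                if b == y:
                    return True
                seen.add(b)
                stack.append(b)
    return False

print(prior("John 1:18/20", "a", "d"))  # True: a -> b -> d
print(prior("John 1:18/20", "c", "b"))  # False: siblings, no direction
```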
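Step 6 is where the database pays off. For every pair of witnesses, count the units where one witness's reading is prior to the other's according to your local stemmata; the imbalance suggests the direction of textual flow, and contradictory counts flag decisions worth revisiting. A self-contained sketch with two witnesses and invented decisions:

```python
from collections import Counter

# PRIOR holds the directed decisions from the local stemmata: at each unit,
# which reading gave rise to which. READINGS holds what each witness reads.
PRIOR = {
    "u1": {("a", "b")},
    "u2": {("a", "b")},
    "u3": {("b", "a")},
    "u4": set(),  # unresolved unit: no direction decided yet
}
READINGS = {
    "u1": {"W1": "a", "W2": "b"},
    "u2": {"W1": "a", "W2": "b"},
    "u3": {"W1": "a", "W2": "b"},
    "u4": {"W1": "a", "W2": "b"},
}

counts = Counter()
for unit, directed in PRIOR.items():
    for w1, r1 in READINGS[unit].items():
        for w2, r2 in READINGS[unit].items():
            if w1 != w2 and (r1, r2) in directed:
                counts[(w1, w2)] += 1  # w1's reading is prior at this unit

print(counts)
# Counter({('W1', 'W2'): 2, ('W2', 'W1'): 1}): the flow runs mainly W1 -> W2,
# but u3 cuts against it and is worth a second look.
```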
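The search for an optimal set of ancestors behaves like a set-cover problem, so a greedy heuristic gives the flavor. Here "explains" is reduced to simple agreement; the real method also counts readings derivable from an ancestor's reading via the local stemmata. All texts are invented:

```python
def greedy_ancestors(target, candidates):
    """Pick a small set of candidate ancestors whose readings jointly
    account for the target's readings. target: {unit: reading};
    candidates: {name: {unit: reading}}."""
    unexplained, chosen = set(target), []
    while unexplained:
        # take the candidate explaining the most still-unexplained readings
        best, best_units = None, set()
        for name, text in candidates.items():
            covered = {u for u in unexplained if text.get(u) == target[u]}
            if len(covered) > len(best_units):
                best, best_units = name, covered
        if best is None:
            break  # leftovers need another explanation (e.g., new creation)
        chosen.append(best)
        unexplained -= best_units
    return chosen, unexplained

target = {"u1": "a", "u2": "b", "u3": "c"}
candidates = {
    "X": {"u1": "a", "u2": "b", "u3": "d"},  # explains two readings
    "Y": {"u1": "e", "u2": "e", "u3": "c"},  # explains the remaining one
}
print(greedy_ancestors(target, candidates))  # (['X', 'Y'], set())
```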
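Finally, the coherence of an attestation can be probed by asking whether every witness sharing a reading could have inherited it from another witness in the same group; the leftovers are candidates for independent creation or contamination. The ancestor lists are invented:

```python
def incoherent_attesters(attesters, potential_ancestors):
    """Witnesses in the attestation with no potential ancestor inside it.
    One such witness is expected (where the reading arose); more than one
    points to coincidental emergence or mixture."""
    return {w for w in attesters
            if not any(anc in attesters for anc in potential_ancestors.get(w, []))}

attesters = {"W3", "W7", "W9"}  # witnesses sharing the same reading
potential_ancestors = {"W3": ["W1"], "W7": ["W3"], "W9": ["W2", "W5"]}
print(incoherent_attesters(attesters, potential_ancestors))
# {'W3', 'W9'}: the reading may have arisen in W3, but W9 cannot have taken
# it from the group, so it probably arose there independently or by mixture.
```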
I hope that this simplified summary of a very complex (but not impenetrable) process will be helpful, and of course NT colleagues are welcome to offer corrections to any misrepresentations! I applied some basic principles of this approach (though not the full method) to great effect in my dissertation on the Dead Sea Scrolls containing Exodus, and I would encourage other textual scholars looking to expand their methodological repertoire to consider it.