Only spammers seem to be noticing this blog, but for web-trolling software that might be interested in digital humanities and philology, I thought I might add that I have updated the sample output from CollateX.
collatex-table-apparatus.html shows output from user-specified witnesses in the form of (1) an alignment table based on the user-specified order, (2) an extracted base text (taking the first specified witness as the base text), and (3) a generated apparatus.
CollateX is not perfect. Some of the output problems are the result of tokenizing (the samples used were tokenized very coarsely) and can be fixed. Abbreviations, and prepositions that are sometimes written attached to the following word and sometimes separately (של; likewise words such as כיצד), can also be fixed. But some errors have to do with how CollateX deals with edit distance. I am not sure yet how we are going to handle this.
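To make the two kinds of problem concrete, here is a minimal Python sketch. It is not CollateX's code, and the function names (`tokenize`, `split_shel`, `edit_distance`) are my own hypothetical helpers. It shows a coarse whitespace tokenizer, a naive normalization that detaches a prefixed של so that attached and separate spellings yield the same tokens, and a plain Levenshtein edit distance of the general kind a collation tool might use to compare tokens.

```python
def tokenize(text):
    """Coarse tokenizer: split on whitespace, strip surrounding punctuation."""
    tokens = []
    for raw in text.split():
        token = raw.strip(".,:;!?\"'()[]")
        if token:
            tokens.append(token)
    return tokens


def split_shel(tokens):
    """Detach a prefixed של so 'שלרבי' and 'של רבי' produce the same tokens.

    Assumption: any longer token beginning with של carries the preposition.
    That is not true in general (see the note after the code).
    """
    out = []
    for token in tokens:
        if token.startswith("של") and len(token) > 2:
            out.extend(["של", token[2:]])
        else:
            out.append(token)
    return out


def edit_distance(a, b):
    """Levenshtein distance: minimum single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]
```

The prefix rule is deliberately crude: it would also split words such as שלום that merely begin with the same letters, so a real normalizer would need a word list or morphological check. The tokenizing problems can be handled by preprocessing of this sort before collation; the edit-distance problems cannot, because they concern how near-matching tokens are compared once they have been tokenized.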
Hayim Lapin is Robert H. Smith Professor of Jewish Studies and Professor in the Department of History at the University of Maryland. He currently is completing a faculty fellowship at MITH. This post originally appeared at Digital Mishnah on January 2, 2012.