The idea of quantifying Moby-Dick is exciting, and the results returned by the tools we were instructed to use are perhaps not altogether surprising. The novel is packed with Shakespearean language, is about a very specialized topic (whaling), and is formally very odd in places. But that, of course, just means Moby-Dick is an ideal text for these sorts of experiments, right? Let’s see…
First, I ran Moby-Dick through Wordle, resulting in this diagram:
Secondly, WordItOut:
The most obvious difference between the two is the choice for the largest word. In Wordle’s image, ‘whale’ and ‘one’ are, unsurprisingly, the largest words. WordItOut, however, displays ‘all’ as its largest word, with ‘whale’ and ‘one’ the runners-up. The word ‘all’ does not appear in Wordle’s image, meaning that program casts it aside as too common a word to be of any use. Now, I do see the logic in this decision in some form; ‘all’ is a common word, and can sometimes be used as a needless intensifier or a purely quantitative word. In this case, however, I contest Wordle’s decision: in Ahab’s final monologue he explicitly describes Moby Dick as “all-destroying” as he speeds, harpoon in hand, towards the beast that is destroying his ship. The ‘all’ in this case is not just a simple word. It is an intensifier, certainly, but it also stands for Ahab’s life (the whaling trade) and Ahab himself (his soul has been scarred and his body maimed). It is possible to read more into this word than the mere commonality ascribed to it by Wordle’s software.
The major characters of the novel are also mentioned: Queequeg, Stubb, Starbuck, and Ahab, but some are missing. Ishmael is gone despite being the narrator, though aside from the opening sentence his name is barely mentioned, if at all (mostly only annotations ever refer to him by name). More interesting, though, is the absence of one of Ahab’s right-hand men: Flask. Naturally, this means he is mentioned less, or at least referred to by name fewer times, than the other mates of the Pequod, but perhaps this opens up a line of inquiry to pursue: why do Starbuck and Stubb get so much attention as to appear quantitatively more visible?
Next, I placed the contents of the word cloud into the Up-Goer Five, receiving the expected list of forbidden words:
Stubb, stub, brush, check, end, point, boats, captain, sperm, sea, ship, thou, nor, boat, Ahab, ye, whales, deck, Queequeg, Starbuck, chapter, whale, among
This list can be divided easily into three categories: names (Stubb, Ahab, Queequeg, and Starbuck), archaisms (thou, ye, nor), and nautical terms (stub, brush, check, point, boats, sperm, sea, ship, boat, whales, deck, whale). None of these are surprising to see on the list: the names are odd, the archaisms are by definition not going to be common, and our modern society is less reliant on the ship trade, which renders the nautical terms scarcer. I would guess they wouldn’t appear in the top 1000 words in 1851 either.
The interesting remainders are ‘end’ and ‘among’, which, I’ll admit, I am surprised are not within the ten hundred most used words.
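The kind of check the Up-Goer Five performs can be sketched in a few lines of Python. This is only an illustration: the `ALLOWED` set is a tiny hypothetical stand-in for the editor's actual list of the ten hundred most used words, and `forbidden_words` is a name of my own, not part of the tool.

```python
import re

# Hypothetical stand-in for the real "ten hundred" word list;
# the actual editor relies on a much longer list.
ALLOWED = {"the", "of", "and", "a", "man", "old", "water", "big"}

def forbidden_words(text, allowed=ALLOWED):
    """Return the words in `text` that are not in `allowed`, in first-seen order."""
    seen = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in allowed and word not in seen:
            seen.append(word)
    return seen

print(forbidden_words("The old man and the whale of the sea"))
```

With this toy list, only ‘whale’ and ‘sea’ are flagged; swapping in a different allowed list changes which words count as forbidden.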
Next comes the CLAWS part-of-speech tagger. This tool, as Mary and Dan reported, is not only less visually appealing, but also harder to read for someone not familiar with its format. But the tool was surprisingly good at recognizing the proper nouns (Queequeg, Stubb, Starbuck, and Ahab) as such, rather than returning some sort of error or just tagging them as ordinary nouns. Since proper nouns typically depend upon context to be recognized, CLAWS’ ability to recognize them is impressive. Aside from the names, the list consists mostly of nouns and adjectives, with a few prepositions (upon, among) and an interjection (oh), but fewer verbs than I expected, with only five by my count: said, cried, go, thought, and know.
Finally, with the TAPoR/Voyant tool, I found myself lucky that the first chapter of Moby-Dick was a default on the website. Unfortunately, the diagnostic returned was not all that interesting, so I went ahead and uploaded the entire text.
The cloud, or ‘cirrus’, for Voyant is prone to including “useless” words such as articles, as you can see. Unlike Wordle and WordItOut, it does not take the liberty of automatically removing certain words (and thereby removing some potentially important ones, as in the case of ‘all’). Fortunately, it allows you to customize your list and essentially blacklist the words you do not want. Wordle provides this feature as well, but removes words by default. Voyant forces the uploader to think about and choose the words represented.
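To make the effect of a default stop list concrete, here is a minimal sketch in Python. The `DEFAULT_STOPS` set and the `top_words` function are hypothetical and do not reproduce the lists that Wordle, WordItOut, or Voyant actually use; the point is only that whether ‘all’ shows up depends entirely on whether someone chose to exclude it.

```python
import re
from collections import Counter

# Hypothetical default stop list; each tool ships its own.
DEFAULT_STOPS = {"the", "a", "of", "and", "all"}

def top_words(text, stop_words, n=3):
    """Return the n most frequent words in `text`, ignoring `stop_words`."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)

sample = "all the whales and all the boats and all of the sea"

# With the default list, 'all' never reaches the cloud.
print(top_words(sample, DEFAULT_STOPS))

# Taking 'all' off the stop list makes it the most frequent word.
print(top_words(sample, DEFAULT_STOPS - {"all"}))
```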
As you can see in the screenshot, the first word I selected that seemed, to me, to be worth scanning was ‘whale’, with a total of 971 uses beginning on the very first page. What is fascinating about Voyant is the multiple ways it will contextualize and build information around a single word. There are two windows dedicated to showing a frequency chart and the context around each mention, as well as tabs showing where in the corpus your chosen word (or words) appears. This helps to alleviate any suspicion about how a word is being used, especially when dealing with an ambiguous word (unlike ‘whale’) that may have multiple uses and contexts.
Looking at the use of ‘whale’ throughout the entire book, I would be tempted to explore the periodic lulls in its mentions visible in the line graph. When the graph is given 10 or 15 segments, these oscillations are more drastic and show much more sporadic mentions of the term. Most interestingly, what can be seen is a steady decline in the use of ‘whale’ until the start of the final chapters of the book, the chase sequence, at which point it begins a steep incline. There is seemingly a dramatic tension recognizable in the graph through the usage of the term.
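The segmented line graph can be approximated with a short Python sketch. I do not know how Voyant actually divides a text, so the equal-sized word slices here are an assumption, and `segment_counts` is a hypothetical helper name.

```python
import re

def segment_counts(text, term, segments=10):
    """Count `term` in consecutive, roughly equal slices of the text.

    Assumes equal-sized word slices; depending on the text length this
    can produce slightly fewer than `segments` slices.
    """
    words = re.findall(r"[a-z]+", text.lower())
    size = max(1, -(-len(words) // segments))  # ceiling division
    return [words[i:i + size].count(term) for i in range(0, len(words), size)]

sample = "whale whale sea ship ship deck deck whale whale whale"
print(segment_counts(sample, "whale", segments=5))
```

Raising `segments` makes each slice shorter, which is one reason the curve looks more jagged at 10 or 15 segments than at a coarser setting.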
So, when I think about Ramsay’s idea of “estrangement” from textuality, I have to wonder what it is within the text, or about the text, that is the primary subject of estrangement. Is it the narrative? In every instance my initial responses have been grounded within the narrative: why is Flask mentioned less? Why is the word ‘all’ important enough that its absence from the word cloud is a significant loss? What time frame is represented by the steep incline at the end of the line graph? All of these questions are brought about because of my familiarity with the reading: a product of the close-reading-focused education that insisted I read Moby-Dick because it, singularly, is important and stands above thousands of anonymous books. But when it comes to the answers to my questions, are they all necessarily going to return to the narrative? Personally, it seems the temporary estrangement is merely a way of refocusing on the narrative and re-reading it, arriving at Ramsay’s purported goal: creating new information and criticism from what the algorithms can show us.