Why use visualizations to study poetry? https://mith.umd.edu/why-use-visualizations-to-study-poetry/ Mon, 07 May 2012

My current research uses visualizations to show latent patterns that may be detected in a set of poems using computational tools, such as topic modeling. In particular, I’m looking at poetry that takes visual art as its subject, a genre called ekphrasis, in an attempt to distinguish the types of language poets tend to invoke when creating a verbal art that responds to a visual one. Studying words’ relationships to images and then creating more images to represent those patterns calls to mind a longstanding contest between modes of representation—which one represents information “better”? Since my research is dedicated to revealing the potential for collaborative and kindred relationships between modes of representation historically seen in competition with one another, using images to further demonstrate patterns of language might be seen as counterproductive. Why use images to make literary arguments? Do images tell us something “new” that words cannot?

Without answering that question, I’d like instead to present an instance in which using images (visualizations of data) to “see” language led to an improved understanding of the kinds of questions we might ask and the types of answers we might want to look for, an understanding that wouldn’t have been possible had we not seen the language differently—through graphical array.

Currently, I’m using a tool called MALLET to create a model of the possible “topics” found in a set of 276 ekphrastic poems. There are already several excellent explanations of what topic modeling is and how it works (many thanks to Matt Jockers, Ted Underwood, and Scott Weingart, who posted these explanations with humanists in mind), so I’m not going to spend time explaining what the tool does here. I will say, however, that working with a set of 276 poems is atypical: topic modeling was designed to work on millions of words, and 276 poems doesn’t even come close. Part of the project has been to determine a threshold at which we can get meaningful results from a small dataset, so this particular experiment is playing with the lower thresholds of the tool’s usefulness.
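
For readers who want to see what this looks like in practice, here is a minimal sketch of the two MALLET invocations involved, wrapped in Python. The import-dir and train-topics commands and their options are MALLET’s real ones, but the directory and file names (poems/, poems.mallet, topic_keys.txt, doc_topics.txt) are hypothetical placeholders, not the files from my project.

```python
import subprocess

# Step 1: import one plain-text file per poem into MALLET's binary format.
subprocess.run([
    "bin/mallet", "import-dir",
    "--input", "poems/",
    "--output", "poems.mallet",
    "--keep-sequence",        # required for topic modeling
    "--remove-stopwords",
], check=True)

# Step 2: train a 15-topic model and write out the topic keys and the
# per-document topic proportions discussed below.
subprocess.run([
    "bin/mallet", "train-topics",
    "--input", "poems.mallet",
    "--num-topics", "15",
    "--output-topic-keys", "topic_keys.txt",
    "--output-doc-topics", "doc_topics.txt",
], check=True)
```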

When you run a topic model (train-topics) in MALLET, you tell the program how many topics to create, and when the model runs, it can output a variety of results. As part of the tinkering process, I’ve been adjusting the number of topics MALLET uses to generate the model, and I was just about to despair that the real tests I wanted to run wouldn’t be possible with 276 poems. Perhaps it was just too few poems to find recognizable patterns. For each topic, MALLET assigns an ID number and a set of “topic keys,” keywords that characterize that topic. Usually, when the topic model is working, the results are “readable” because they represent similar language. MALLET would not call a topic “Sea,” for example, but might instead provide the following keywords:

blue, water, waves, sea, surface, turn, green, ship, sail, sailor, drown

The researcher would look at those terms and think, “Oh, clearly that’s a nautical/sea/sailing topic,” and dub it as such. My results on 15 topics over 276 poems, however, were not readable in the same way. For example, topic 3 included the following topic keys:

3 0.04026 with self portrait him god how made shape give thing centuries image more world dread he lands down back protest shaped dream upon will rulers lords slave gazes hoe future

I don’t blame you if you don’t see the pattern there. I didn’t. Except, well, I know some of the poems in the set pretty well, and I know that it put together “Landscape with the Fall of Icarus” by W.C. Williams with “The Poem of Jacobus Sadoletus on the Statue of Laocoon” with “The New Colossus” with “The Man with the Hoe Written after Seeing the Painting by Millet.” I could see that we had lots of kinds of gods represented, farming, and statues, but that’s only because I knew the poems. Without topic modeling, I might have put this category together as a “masters” grouping, but it’s not likely. Rather than look for connections, I was focused on the fact that the topic keys didn’t make a strong case for being placed together, and other categories seemed similarly opaque.

However, just to be sure that I could, in fact, visualize results of future tests, I went ahead and imported the topic associations by file. In other words, MALLET can also produce a file that lists each topic (0-14 in this case) alongside each file name in the dataset and a percentage representing the degree to which the topic is represented inside that file. I imported this MALLET output into Google Fusion Tables and created a dynamic bar graph with file IDs along the vertical axis and, along the horizontal axis, the degree to which a given topic (in this case topic 3) is present in each file. As I clicked through each topic’s graph, I figured I was seeing results that demonstrated MALLET’s confusion, since the dataset was so small. But then I saw this:

[The original post embedded a dynamic Google Fusion Tables bar graph here, showing how strongly topic 3 is represented in each file; a static version of the chart was also linked.]

In the dynamic graph, passing your mouse over the bars that rise above 0.4 reveals the file ID number (a random number assigned during data preparation). Each of these files begins with the same prefix: GS. In my dataset, that means the files with the highest representation of topic 3 can all be found in John Hollander’s collection The Gazer’s Spirit. This anthology is considered one of the most authoritative and diverse—beginning with classical ekphrasis all the way up to and including poems from the 1980s and 1990s. Given that disparity in time periods, I had expected the poems from this collection to be the most difficult to group together, because their diction changes dramatically from the beginning of the volume to the end. In other words, I would have expected these poems to blend with the other ekphrastic poems throughout the dataset on the basis of similar diction more than anything else. MALLET has no way of knowing that these files belong to the same anthology; all of the bibliographical information about the poems was stripped from the text being tested. There has to be something else, and what that something else might be requires another layer of interpretation. I will need to return to the topic model to see whether a similar pattern is present when I use other numbers of topics—or if I add some non-ekphrastic poems to the set being tested—but seeing the affinity in language among the poems included in The Gazer’s Spirit, in contrast to other ekphrastic poems, proved useful. Now I’m not inclined to throw the whole test away, but instead to perform more tests to see if this pattern emerges again in other circumstances. I’m not at square one. I’m at a square two that I didn’t expect.
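
The same filtering could be done without Fusion Tables at all. Below is a hedged sketch in Python that scans MALLET’s doc-topics output for files in which topic 3 exceeds 0.4. It assumes the older doc-topics layout (document index, file name, then alternating topic/proportion pairs sorted by weight), which varies across MALLET versions, and it reuses the hypothetical doc_topics.txt name from the earlier sketch.

```python
# Sketch: list files where topic 3 accounts for more than 0.4 of a poem.
# Assumes the older doc-topics layout: doc index, file name, then
# alternating (topic id, proportion) pairs sorted by proportion.
TOPIC, THRESHOLD = 3, 0.4

with open("doc_topics.txt") as f:
    for line in f:
        if line.startswith("#"):              # skip the header row
            continue
        fields = line.split()
        name, rest = fields[1], fields[2:]
        weights = {int(t): float(p)           # topic id -> proportion
                   for t, p in zip(rest[0::2], rest[1::2])}
        if weights.get(TOPIC, 0.0) > THRESHOLD:
            print(name)                       # in my data, all "GS" files
```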

The visualization in the end didn’t produce “new knowledge.” It isn’t hard to imagine that an editor would choose poems that construct a particular argument about what “best” represents a particular genre of poetry; however, if these poems truly represented the diversity of ekphrastic verse, wouldn’t we see other poems also highly associated with a “Gazer’s Spirit topic”? What makes these poems stand out so clearly from others of their kind? Might their similarity mark a reason why critics of the 1990s and 2000s define the tropes, canons, and traditions of ekphrasis in a particular vein? I’m now returning to the test and to the texts to see what answers might exist there that I and others have missed as close readers. Could we, for instance, run an analysis that determines how closely other kinds of ekphrasis are associated with The Gazer’s Spirit’s definition of ekphrasis? Is it possible that poetry by male poets is more frequently associated with that strain of ekphrastic discourse than poetry by female poets?

This particular visualization doesn’t make an “argument” in the way humanists are accustomed to making them. It doesn’t necessarily produce anything wholly “new” that couldn’t have been discovered some other way; however, it did help me get past a particular kind of blindness and see alternatives—to consider what has been missed along the way—and there is, and will be, something new in that.

Lisa Rhody is a Ph.D. candidate in English at the University of Maryland, a Spring 2012 MITH Winnemore Dissertation Fellow, and a lecturer on the arts for the Virginia Museum of Fine Arts. This post first appeared on Lisa’s personal blog on April 30th, 2012.

Small Projects & Limited Datasets https://mith.umd.edu/small-projects-limited-datasets/ Mon, 02 Apr 2012

I’ve been thinking a lot lately about the significance of small projects in an increasingly large-scale DH environment. We seem almost inherently to know the value of “big data”: scale changes the name of the game. Still, what about the smaller universes of projects with minimal budgets, fewer collaborators, and limited scopes, which nevertheless have large ambitions about what can be done using the digital resources we have on hand? Rather than detracting from the import of big data projects, I, like Natalie Houston, am wondering what small projects offer the field and whether their potential outcomes are relevant and useful both in and of themselves and as a benefit to large-scale projects, for instance in fine-tuning initial results.

My project in its current iteration involves a limited dataset of about 4,500 poems and challenges rudimentary assumptions about a particular genre of poetry called ekphrasis—poems about works of visual art. It is the capstone project to a dissertation in which I use the methods of social network analysis to explore socially inscribed relationships between visual and verbal media and in which the results of my analysis are rendered visually to demonstrate the versatility and flexibility available to female poets writing ekphrastic poetry. My MITH project concludes my dissertation by demonstrating that network analysis is one way of disrupting existing paradigms for understanding the social signification of ekphrastic poetry, but there are more methods available through computational tools, such as text modeling, word frequency analysis, and classification, that might also be useful.

To this end, I’ve begun by asking three modest questions about ekphrastic poetry using a machine learning application called MALLET:

  1. Could a computer learn to differentiate between ekphrastic poems by male and female poets? In “Ekphrasis and the Other,” W.J.T. Mitchell argues that were we to read ekphrastic poems by women as opposed to ekphrastic poetry by men, we might find a very different relationship between the active, speaking poetic voice and the passive, silent work of art—a dynamic that informs our primary understanding of how ekphrastic poetry operates. Were this true, and were the difference to occur within recurring topics and language use, a computer might be trained to recognize patterns more likely to co-occur in poetry by men or by women.
  2. Will topic modeling of ekphrastic texts pick out “stillness” as one of the most common topics in the genre? Much of the definition of ekphrasis revolves around the language of stillness: poetic texts, it has been argued, contemplate the stillness and muteness of the image with which they are engaged. Stillness, metaphorically linked to muteness, breathlessness, and death, provides one of the most powerful rationales for understanding how words and images relate to one another within the ut pictura poesis tradition—usually seen as a hostile encounter between rival forms of representation. The argument to this point has been made largely through critical interpretations enacted in close readings of a limited number of texts. Would a computer designed to recognize co-occurrences of words, and to assign those words to a “topic” based on the probability that they occur together, also reveal a similar affiliation between stillness and death, muteness, even femininity?
  3. Would a computer be able to ascertain stylistic and semantic differences between ekphrastic and non-ekphrastic texts and reliably classify poems according to whether or not their subject is an aesthetic object? We tend to believe that there are no real differences between how we describe the natural world and how we describe visual representations of the natural world. We base this assumption on human, interpretive, close readings of poetic texts; however, a computer might recognize subtle differences as statistically significant when considering hundreds of poems at a time. If a classification program such as MALLET could reliably categorize texts as ekphrastic or non-ekphrastic, it is possible that we have missed something along the way. (A sketch of such an experiment follows this list.)
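
To make the third question concrete, here is a minimal sketch of the kind of classification experiment it imagines. It swaps MALLET’s own train-classifier command for scikit-learn purely for brevity, so this is an illustration of the technique rather than my project’s actual pipeline; the toy poems and labels below are invented placeholders standing in for the real dataset and its hand-assigned tags.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Invented toy data: in practice these would be the thousands of poems
# and their hand-assigned tags (1 = ekphrastic, 0 = non-ekphrastic).
poems = [
    "the marble figure stands silent in the gallery light",
    "brushstrokes of blue where the painted river bends",
    "the portrait's eyes follow the viewer across the frame",
    "wind moves the wheat field under an open sky",
    "the river bends past willows toward the sea",
    "a sparrow sings from the hedge at dawn",
]
labels = [1, 1, 1, 0, 0, 0]

# Bag-of-words features feeding a logistic regression classifier.
model = make_pipeline(CountVectorizer(stop_words="english"),
                      LogisticRegression(max_iter=1000))

# Cross-validated accuracy well above chance would suggest the two kinds
# of description differ in ways a machine can detect. (With toy data this
# number is meaningless; it only shows the shape of the test.)
scores = cross_val_score(model, poems, labels, cv=3)
print(scores.mean())
```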

In general, these are small questions constructed in such a way that there is a reasonable likelihood that we may get useful results. (I purposefully choose the word results instead of answers, because none of these would be answers. Instead, the result of each study is designed to turn critics back to the texts with new questions.) And yet, how do we distinguish between useful results and something else? How do we know if it worked? Lots of money is spent trying to answer this question about big data, but what about these small and mid-sized data sets? Is there a threshold for how much data we need for results to be accurate and trustworthy? Can we actually develop standards for how much data is needed to ask particular kinds of humanities questions and make relevant discoveries? In part, my project also addresses these questions, because otherwise I can’t make convincing arguments about the humanities questions I’m asking.

Small projects (even mid-sized projects with mid-sized datasets) offer the promise of richly encoded data that can be tested, reorganized, and applied flexibly to a variety of contexts without potentially becoming the entirety of a project director’s career. The space between close, highly supervised readings and distant, unsupervised analysis remains wide open as a field of study, and yet its potential value as a manageable, not wholly consuming, and reproducible option makes it worth seriously considering. What exactly can be accomplished by small and mid-scale projects is largely unknown, but it may well be that small and mid-sized projects are where many scholars will find the most satisfying and useful results.

Lisa Rhody is a Ph.D. candidate in English at the University of Maryland, a Spring 2012 MITH Winnemore Dissertation Fellow, and a lecturer on the arts for the Virginia Museum of Fine Arts. This post first appeared on Lisa’s personal blog on March 31, 2012.

Chasing the Great Data Whale https://mith.umd.edu/chasing-the-great-data-whale/ Fri, 02 Mar 2012

The first thing you hear, or at least should hear, when you present an idea for a digital humanities project to someone already familiar with the field is this: “That’s great! [pause] What does your data set look like?” Actually, that’s the reaction you’ll get if whoever you’re talking to is taking you seriously, so the reaction is a mixed blessing. On the one hand, you have their attention. On the other, they know enough to point out that projects without data go nowhere, and good data (not necessarily synonymous with big data) is hard to find. Data is truly the Moby Dick of the digital humanities. None is ever big enough, clean enough, or well-structured enough to achieve precisely what it is that researchers would like to achieve. Just when you (and more likely your team) feel confident that the data set is “harpooned” (structured, refined, aptly tagged, and curated), the whale takes off again with a new standard: interoperability. It doesn’t play well with other data, and the chase begins again. When your data set works for you, that has some value, but when your data set works with others, well, that means a wider audience and a broader impact for your work.

Most projects have lifespans determined by fellowships or grants or sabbaticals, and unlike Captain Ahab, we can’t afford to have our hard work dragged out into the abyss and lost. The DH mantra may well be “project or perish.” Hard decisions about data are often determined by two factors: intellectual value and time. First, your data needs to be thoughtfully selected, described (tagged with metadata), and clean enough to work with and to reasonably support the argument that what you’ve done with the data can be trusted. However, you also need to know when it’s time to cut the rope and release what might yet be done, a choice between good enough and great. When just a few more hours of tagging or a few more weeks of correcting OCR errors seem just within your grasp, the choice feels mercenary. Deciding to stop improving your data, though, is like the difference between Faulkner and Melville: “you must kill all your darlings.” Data sets, really, are Modernist objects: they are “what will suffice.”

This is the lesson I learned during my first full month at MITH. As a Winnemore Dissertation Fellow, I have approximately four months to capitalize on MITH’s resources and to produce a project that has a strong (but not perfect) data set, answers relevant humanities questions, and possesses enough sustainability, public outreach, and external support to become a viable, fundable project in the future. In these first five weeks, I have benefitted from the wealth of experience and depth of knowledge of the team assigned to my project. Jennifer Guiliano knows how to pull projects together, shepherding them through the steps of data management, publicity, and sustainability. Taking to heart her sage wisdom about managing successful DH projects, I feel that I have a much stronger grasp on what steps must be taken now and what can and must happen later, professional development knowledge that more graduate students should have when they venture into the alt-ac or tenure-track market. Trevor Muñoz’s expertise with data sustainability prompted questions that have helped shape a future for my project even at its earliest outset—few graduate students have the time or the resources to think about how adequate curation in the short term could mean greater benefit in the long term. Amanda Visconti and Kristin Keister have been helping me shape a public portal for my work, and I know that the virtual presence they help create will contribute to the future success of the project, as well as to its intellectual value.

The most salient lessons I’ve learned about the lure of the great data whale, however, have been from Travis Brown. I arrived in January with a large but unrefined dataset of approximately 4,500 poems in a less-than-uniform collection of spreadsheets. I had a basic understanding of what I wanted the data to look like, but in consultation, Travis helped me determine what would work better and might last longer. Travis created a data repository on MITH’s space with a version control system called “git.” (Julie Meloni’s often-cited and useful “Gentle Introduction to Version Control” on ProfHacker provides a useful explanation of what git is, why it’s valuable, and where to find resources if you’d like to try it.) Once I installed git on my machine, cloned (“replicated”) the repository, and “pushed” the data to it (basically moving the data from my computer into the repository Travis created), Travis could take a look. We agreed to separate the text of the poems and their titles from the descriptive information about them (author, date, publication, etc.), to name each poem’s file with a uniform identification number, and to track its descriptive data in a separate spreadsheet. We realized at that point that there were some duplicates, and in conversation agreed to keep and tag them (in case there was a reason for them, such as minor textual differences) so that we could come back to examine them later, but not include them in the tests I’d like to run in the meantime.

Then came the “darling killing.” Metadata, the information that describes the poetic texts that make up my data set, is necessary for the tests I’d like to run—those that classify and sort texts based on the likelihood that words co-appear in particular poems. The amount of metadata I include to describe the poems will determine the kinds of tests I can run and their reliability. However, tagging 4,500 poems, even with simple identifiers, could take the whole four months of my fellowship if I let it. The hard choice was this: I would tag only the gender of the poet associated with each poem and whether the poem is ekphrastic (that is to say, written about, to, or for a work of visual art), not ekphrastic, or unknown. Some poems were easily tagged as ekphrastic, because I had sought them out specifically for that purpose; more often than not, however, I would need to make poem-by-poem determinations about the likely poetic subject. This takes time, and because of the tests I need to run to ask the questions I want to ask (e.g., can MALLET distinguish between descriptions of art and descriptions of nature?), I will need to let go of other useful, informative, helpful tags I would like to have, like the date each poem was published, the artwork to which it refers, and so on.
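
Concretely, the trimmed-down metadata might look something like the sketch below. The file name, column names, and rows are hypothetical stand-ins for the spreadsheet described above, not my project’s actual schema: one row per poem file (named by its ID), with only the two tags that survived the darling killing.

```python
import csv

# Hypothetical metadata rows: poet gender plus the ekphrastic tag
# (yes / no / unknown), keyed by the uniform ID that names each poem file.
rows = [
    {"id": "0001", "gender": "f", "ekphrastic": "yes"},
    {"id": "0002", "gender": "m", "ekphrastic": "no"},
    {"id": "0003", "gender": "m", "ekphrastic": "unknown"},
]

with open("metadata.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "gender", "ekphrastic"])
    writer.writeheader()
    writer.writerows(rows)
```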

I am keeping a record of all these things. The decision not to be perfect is the right one, but it isn’t easy. I feel sometimes as though I have watched my whale slip from my grasp and sink back into the sea. My crew is safe. My project will continue on schedule, but not without the longing, nagging feeling that I need to return to this place again. Perfect data sets are a myth, one that often forms a barrier to new scholars who wish to begin DH projects. Rather than struggling for the perfect data set, I want to suggest that we place a much stronger emphasis on the more intellectual and more necessary work of data curation and data versioning. I would argue that we judge projects not by the “completeness” or “perfection” of their data, but by how well their versioning has been documented, how thoroughly curatorial decisions, such as what to tag, when, and why, have been publicized and described, and how much the evolution of the data contributes to the development of other projects within the community. Much the same way that we know more about the value of an article by how often it has been cited, we should value a digital humanities project by how much can be learned by the projects that follow it. In this sense, we should all be called Ishmael.

Lisa Rhody is a Ph.D. candidate in English at the University of Maryland, a Spring 2012 MITH Winnemore Dissertation Fellow, and a lecturer on the arts for the Virginia Museum of Fine Arts. This post first appeared on Lisa’s personal blog on March 1, 2012.
