Public Writing Audit #1

I decided to use the first Public Writing Audit as a chance to do some Storifying. I’ve seen people use Storify, but I don’t think I ever quite understood how it actually worked. Going through all my tweets from this semester, which required a fair bit of searching through twitter with Storify’s functions, made me realize that this can be a tedious and time-consuming process. But I also think it holds great potential.

My Storify is organized around DH Events, What I’m Reading, PDA 2013, #TransformDH, ENGL 668K exercises and tweets, and Miscellaneous. It’s several pages long, so be prepared. It’s here:

http://storify.com/MelissaRogers17/668k-public-writing#publicize

Going through these tweets, I realized a few things.

1) I could create a Storify of all my tweets related to Star Trek and chronicle the exact dates and times I was watching certain episodes!! This would be fun but it would also reveal (to a number of eminent academic twitterists, including my adviser) just how much time I spend watching Star Trek….a lot. But, it would also be a fun way to write that mini-essay on Star Trek that’s just been itching to come out–I could annotate all my Star Trek tweets, which are usually quotes or “#wisdom” from my favorite characters, delightfully out of context. But you’re not here to read about Star Trek, so moving on….

2) A chronology of tweets reveals something different than a Storify of tweets organized around certain topics. I thought that first I would just put all the DH-related tweets in chronologically, but my immediate inclination while doing this was to group tweets together. Topics started to emerge. “#TransformDH” and “What I’m Reading” are the biggest categories. “Miscellaneous” tends to contain my (and others’) twitter snark.  And there are usually clumps of tweets showing up right before and during our class time for ENGL 668K. BUT, these topics can also be plugged into a timeline. Rearranging them and putting them chronologically in order WITHIN the topics felt like putting together a narrative puzzle–“Oh right, that was the night I stayed up till 3am…twittering. Oh right, that Saturday six amazing conferences were going on at the same time…while I was home writing.” It made me think about my own story (and chronology) differently, which is exciting for a n4velg4zer errrrrrr ahem, autobiographer.

3) I thought my “public writing” on Twitter would be ME, writing, publicly. However, I’m a big retweeter. In fact, as I once observed on Twitter, retweeting is my jam. It often frustrates me to tweet from my computer as opposed to my phone, because my phone app allows me to easily quote other people’s tweets when retweeting, while I haven’t quite figured out how to do this on the computer. [Suggestions for useful PC twitter apps are welcome.] Often I retweet folks sharing snippets of events that I couldn’t attend, or pithy twitter poetics that are probably very decontextualized. So, I had to make a Storify decision as to whether I wanted this story to be just about/by me–impossible in the public writing context of Twitter.

4) I’m twitterpated. The description on my Twitter profile, which I created more than a year and a half ago, suggests that I try not to be. And I really did try hard not to like Twitter. But then I found myself falling into Twitter holes all over the place, discovering all kinds of things. In fact, it has changed the way I do research. For example, I have a list entitled “Zine Love” that enables me to see all my zine-lovers’ tweets about things, zine-related and not. I also have an “Academia” list that enables me to see all tweets from the theory badasses I follow. But the problem with this is that I can’t be on Twitter all the time–it’s impossible. So as twitterpated as I am, I will never be able to see ALL THE TWEETS. I have to be resigned to dipping into the stream when I have the time. [Which doesn't mean I resist the urge to endlessly scroll until I've caught up on all the action that happened since the last time I took a dip.]

I wanted to add up the total number of characters I tweeted, or figure out how many words it actually was since the Twitter gauntlet was thrown down early on in the course (What is a “significant” number of tweets? Is it the quality of the tweets or the quantity? What subject matter counts as public writing for this course?), but the Storify exhausted me. It will have to do on its own.

In terms of my blog posts for this class and for my own collaborative project, SqueakyWheelCollective.wordpress.com, the word count totals at least 4,573 (not including our exercises or words written after this sentence).

Dang. Now if I could only write that many words for that draft due in two weeks….

Stay tuned for the text of my Personal Digital Archiving Talk, “Public Displays of Affection: Digital Zine Archives and the Labor of Love.” [heh, c wut i did there? PDA, snicker snicker.]

 

Dinner before Folger

Hi Everyone!

Just wanted to extend a dinner invite to anyone who’s interested – a few of us are eating at Nando’s Peri-Peri in Chinatown before going to the Folger Library tomorrow evening (since we’ll have to switch Metro lines anyway). If you’d like to join us, we’re planning on getting there at 5:45pm or so. We can eat, then take the Red line to Union Station. If you’d like to coordinate closer, feel free to email me, and I can give you my cell #. See you all tomorrow night!

- Charity

Paper Machines Update

Here is a suggested fix for Paper Machines that has now worked in two cases, which gives me some confidence that it’s generally useful. Thanks to Courtney Wells, who volunteered her MacBook to test this fix. These instructions apply to Zotero running in Firefox (not standalone Zotero), and to Mac OS X systems. Here’s the summary:

  • Uninstall Paper Machines
  • Install Python 2.7.3
  • Reinstall Paper Machines
  • Make sure that the Path to Python executable is /usr/local/bin/python
  • Quit and restart Firefox

Here are the details:

  • Click Tools on the Firefox menu bar, and then select Add-ons. This gets you to the Add-ons Manager. Find Paper Machines 0.3.6, and click on the Remove button.
  • Download Python by clicking here. Open your Downloads folder and you should see a disk image file named python-2.7.3-macosx10.6.dmg. Double click on it, and when it opens, double click on Python.mpkg. You will then be led through the installation of Python.
  • Reinstall Paper Machines by clicking here in Firefox.
  • Open Zotero by clicking on the Zotero logo at the lower right of Firefox. Control click on TextHeap (under ENGL668K in Group Libraries on the lower left) and select Paper Machines Preferences… at the bottom of the contextual menu. Make sure you’re on the General Settings tab, and change the Path to Python executable from /usr/bin/pythonw to /usr/local/bin/python (see below).
    [Screenshot: Paper Machines Preferences]
  • Quit and restart Firefox. Open Zotero and find TextHeap. Control click on TextHeap and select Extract Text for Paper Machines. When text extraction runs to completion, you should be able to use other Paper Machines functionality (Word Cloud, Topic Modeling, etc.)
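If the fix doesn’t seem to take, it may help to confirm that a Python interpreter really exists at the new path before restarting Firefox. Here is a minimal sketch, assuming you have Python 3.5 or later available to run the check itself; the helper name python_version_at is made up for illustration and is not part of Paper Machines or Zotero.

```python
import os
import subprocess

# Path taken from the fix above; adjust it if your install differs.
EXPECTED_PYTHON = "/usr/local/bin/python"


def python_version_at(path):
    """Return the version string reported by the interpreter at `path`,
    or None if nothing executable is found there."""
    if not (os.path.isfile(path) and os.access(path, os.X_OK)):
        return None
    result = subprocess.run(
        [path, "-V"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # Python 2 prints its version to stderr, Python 3 to stdout.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = python_version_at(EXPECTED_PYTHON)
    if version is None:
        print("No Python found at " + EXPECTED_PYTHON)
    else:
        print(EXPECTED_PYTHON + " reports: " + version)
```

If the script reports something other than Python 2.7.3, the installation step probably needs another look before you change the Paper Machines preference.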

Please let me know if I’ve explained anything incompletely, or if this fix doesn’t work for you.

How Software Affects Writers

I was reading this recent article in The New Yorker digital edition and was struck by how apropos it is to our class.  In his article “Structure,” non-fiction author John McPhee talks about how important structure is to writing a story, but most interesting to us digital humanists, talks about how a software program called Kedit (kay edit) profoundly changed the way he structures and lays out his stories.  Even when newer software became available, he stuck to his anachronistic version because it was what he knew.

“Structure” strikes me as incredibly interesting for DHers because McPhee was already writing when personal computers became available, and he documents the way they changed his work.  I found it to be one of the first instances of the digital changing the humanities.  Of course, nowadays, almost every author writes using some kind of word processor, but McPhee had to switch from his typewriter to Kedit.  It’s very enlightening to those of us who have almost always had the use of a personal computer.

McPhee Structure

Experimentation, Machines, and The End of the World

This might just be me taking any excuse I can find to talk about an author I like, but the writings of Stanislaw Lem do seem relevant to DH work. Specifically, “How The World Was Saved” applies well to the principles of something like text analysis or topic modeling.

In the story, Trurl (a robotic engineer/constructor) builds a machine that could create anything starting with the letter N. Already this enacts a constraint, from which Trurl can create and build. Trurl must learn to work within this limit (and it is quickly established that there are no shortcuts to get outside of the machine’s programming). The difference, of course, is that what this machine creates also exists in the real world. In other words, by limiting its contents (or data) to a finite number, the programming of the machine enacts a “deformance” on the universe, as if it were a text.

We can also relate Trurl’s machine to the theory vs methodology debate. Trurl doesn’t approach the machine with a specific question he needs answering or an end result he would like to achieve. Instead, Trurl experiments, feeding the machine words and testing the limits of its programming. To relate this to Fish, Trurl does not begin with an interpretive hypothesis that needs answering, as is typical in the humanities. Likewise, he doesn’t apply this hypothesis to the machine’s results and look for a formal pattern. In DH, the direction is reversed (witnessing a formal pattern and then formulating an interpretive hypothesis). Trurl takes the DH approach.

Unfortunately for Trurl, the word he uses to reach this point, “Nothing,” nearly destroys the universe. But what is interesting is how the machine responds to Trurl’s command:

Had I made Nothing outright, in one fell swoop, everything would have ceased to exist, and that includes Trurl, the sky, the Universe, and you – and even myself. In which case who could say and to whom could it be said that the order was carried out and I am an efficient and capable machine?

Trurl’s experiment with “Nothing,” which unfortunately eliminates all of those “worches” and “zits,” can be considered the outlier in the machine’s results. It represents a moment in which, in order to follow both its programming and its command, the machine must alter its methodology somewhat (not to mention the universe itself). This might be the type of result we look for when running topic models or text analyses: the ones that confront what the tool struggles or fails to do or even strains itself doing.

Of course, Trurl’s machine is just science-fiction and doesn’t technically exist, but I wouldn’t mind experimenting with it myself.

(Side Note: Another of Lem’s stories, “Trurl’s Electronic Bard,” could have easily been the subject of another blog post, applying particularly well to our discussion in the first few weeks. But don’t worry, I’ll spare you the long-winded essay . . . for now.)

Twitter & Topic Modeling

I came across this really interesting data today that speaks directly to everything we’ve been discussing in class: Twitter, topic modeling, word clouds, data collection.  This site categorizes tweets in New York City based on the language that is being tweeted in.  Here, we are given our topics (language), but it also gives us an idea of what the ethnic makeup and diversity of neighborhoods is within the five boroughs.  Imagine it as a word cloud of what makes up the city!

Having lived in NYC for 10 years prior to moving to DC last year, I find this extremely enlightening, especially given that I was always told Queens (where I lived) was the most diverse county in the entire United States.  Judging from this, Manhattan is way more diverse, at least in terms of languages spoken.  I’m also shocked that Chinese isn’t one of the languages aggregated by this site, as there is a very large Chinese population in Flushing, Queens.  Additionally, I lived in Astoria, Queens, which has a large Greek community.  Prior to seeing this data, had I done an exercise similar to the farmer’s market exercise we did last night, I would have included topics/languages not seen here.

Make sure to zoom in and out from the streets and also use the roads-to-black scroll bar at the bottom for optimum choices.  There is also a view of London.

http://ny.spatial.ly/

NYC Tweets

Still having problems with Paper Machines?

We don’t really get to declare independence from Travis Brown’s technical support until we have Paper Machines actually running on our laptops. I’ve got everything working except topic modeling, and I think a lot of other people are in the same boat (although I know that Katie did manage to get it working last night). I’ve identified a Paper Machines bug report that may be the same problem we’re having (https://github.com/chrisjr/papermachines/issues/12), but I need more information to be sure. In the interest of getting topic modeling using Paper Machines working for myself (because I actually intend to use it), I’m volunteering to be the point person for debugging this issue. I think (to borrow from Mark Sample) this definitely counts as service, not scholarship.

Please comment on this post with your operating system, Java, and Python version information, plus whether or not Paper Machines is working for you. For operating system information on a Mac, click on the Apple in the top menu bar, and select “About This Mac”. For version information about Java and Python, you’ll need to open the Terminal application and then type “java -version” and “python -V” on the command line. I’m hoping that a Windows person in the class can supplement this with information on how to do the corresponding operations in that environment: since the last Windows OS that I was familiar with was XP Pro, I don’t think that anything I’ll have to say on the subject will be useful.

As an example, the OS on my laptop is Mac OS X Version 10.6.8, the Java version is 1.6.0_39, and the Python version is 2.6.1. I’ve got Paper Machines working except for topic modeling. I’d appreciate hearing from as many of you as possible, including those who have it working (because I need all the data points to see what the non-working installations have in common). Thanks.
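If typing the commands one at a time is a nuisance, the same three data points could be gathered in one go. This is only a rough sketch, assuming Python 3.5 or later is available to run it; the function names tool_version and report are invented for illustration, and the script simply runs the same “java -version” and “python -V” commands described above. It should work on Windows as well, though I haven’t tried it there.

```python
import platform
import subprocess


def tool_version(command):
    """Run a version command and return its first line of output,
    or "not found" if the tool is not installed."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        return "not found"
    # Java (and Python 2) print version information to stderr.
    output = (result.stdout + result.stderr).strip()
    return output.splitlines()[0] if output else "no output"


def report():
    """Collect the operating system, Java, and Python details in one dictionary."""
    return {
        "os": platform.platform(),
        "java": tool_version(["java", "-version"]),
        "python": tool_version(["python", "-V"]),
    }


if __name__ == "__main__":
    for name, value in report().items():
        print(name + ": " + value)
```

Pasting the three printed lines into a comment, along with whether Paper Machines works for you, would give me everything I asked for.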

Paper Machines???

Has anyone tried to run Paper Machines? I have downloaded all the pre-req’s and I know it’s installed (my Firefox just updated and prompted me to review my add-ons – both Zotero and Paper Machines appeared in the list), but I don’t know how to initiate it in Zotero. The directions on GitHub are very sparse:

To begin, right-click (control-click for Mac) on the collection you wish to analyze and select “Extract Texts for Paper Machines.” Once the extraction process is complete, this right-click menu will offer several different processes that may be run on a collection, each with an accompanying visualization. Once these processes have been run, selecting “Export Output of Paper Machines…” will allow you to choose which visualizations to export.

When I right-click on a collection, no such option appears. This is what I see, even with all options investigated:

Screen Shot 2013-02-19 at 5.15.30 PM

Anyone else have any success?

Sherlock Holmes Would Have Been a DHer

Alright, it is time for a super geeky confession: I belong to a Sherlock Holmes society.  At the last meeting a number of members asked me what I was studying and I tried to explain Digital Humanities to them.  It wasn’t, shall we say, the greatest success.  So I’ve been thinking, at one of our next meetings maybe I’ll finally give the presentation–a duty I’ve shirked for all of the 10 years I’ve belonged to the club.  I was trying to think of ways to blend DH with Sherlock Holmes and show how even the most basic of DH tools might be useful when understanding the Sherlock Holmes stories.

Well, the work for this coming week to find a library of sorts related to our texts started me thinking about the similarities between Dracula and Sherlock Holmes–and the men responsible for their creation.  Both authors considered themselves to be the epitome of the Victorian gentleman–upholding the beliefs fundamental to that image.  As such, wouldn’t they have a tendency to choose from the same offerings of the LDA Buffet?  Some editions of Dracula, such as my Project Gutenberg copy, even bill it as “A Mystery Story.”  Would the two men’s word choice reflect this similarity in experience and ideal?

Dracula Word Cloud Sherlock Holmes Wordle

I tried doing the Holmes word cloud with one text–Hound of the Baskervilles–but names like Baskerville and Henry started to dominate so much that one couldn’t see much of the other language. To balance it out, I stuck as much of the Sherlockian Canon as I could find into Wordle. The resulting “footprint,” if I may so call it, seems more representative of Sir Arthur Conan Doyle’s writing, as was the goal.  And judging by the results, it would seem that the two do share a similarity in word choice.  Words like “man,” “know,” “must,” “may,” “light,” “night,” and so on all have strong followings in the clouds.
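For anyone who wants to check the overlap without eyeballing two clouds, the comparison can be approximated with simple word counts. This is a minimal sketch and not how Wordle itself works: the stop list is a tiny made-up sample, and the function names top_words and shared_top_words are my own inventions. The exclude argument is one way to keep character names like Baskerville from crowding everything else out.

```python
import re
from collections import Counter

# A tiny illustrative stop list; a real one would be much longer.
STOP_WORDS = {
    "the", "and", "a", "of", "to", "in", "i", "it", "that",
    "was", "he", "his", "you", "is", "had", "with", "as", "for",
}


def top_words(text, n=10, exclude=()):
    """Return the n most frequent words in text, ignoring stop words
    and any extra words (such as character names) listed in exclude."""
    words = re.findall(r"[a-z']+", text.lower())
    skip = STOP_WORDS | {w.lower() for w in exclude}
    counts = Counter(w for w in words if w not in skip)
    return [word for word, _ in counts.most_common(n)]


def shared_top_words(text_a, text_b, n=10, exclude=()):
    """Return, in alphabetical order, the words that appear
    in the top n of both texts."""
    a = set(top_words(text_a, n, exclude))
    b = set(top_words(text_b, n, exclude))
    return sorted(a & b)
```

Reading two plain-text files (say, Project Gutenberg copies) and passing their contents to shared_top_words with a larger n would give a list to set beside the clouds.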

Now, I’ve often heard said of Doyle that he was not a terribly good writer and that he, instead, had the good fortune to create a character who was original and fascinating enough to come to life in spite of this less than fortuitous entrance into the world.  Holmes captured the imagination of the readers in spite of Doyle’s talent rather than because of it.  Could the same be true of Dracula seeing the linguistic similarities between their authors?  I’m not entirely sure how to test this particular theory–maybe someone else will be able to suggest one–but I thought I could test how the popularity of the characters of Dracula and Holmes have compared to that of their creators.  The idea being that if Holmes and Dracula and their creators shared the limelight it would suggest that there was as much to be said about the creation as the creator.  Doyle and Stoker would be as interesting as authors as their creations were as literary characters.  The result is as follows:

Screen Shot 2013-02-14 at 6.43.00 PM

Google’s Ngram Viewer would seem to support this theory.  The characters have survived far better than their creators–in fact, Holmes leaps to the forefront from the instant of his creation (Dracula has a bit more of an uphill battle at first).  But maybe this is to be expected?  Do characters always do better than their creators?  If so, let’s test on an undeniably talented author and their beloved creation, Jane Austen and Darcy:

Screen Shot 2013-02-14 at 6.44.54 PM

Now, the one problem with the above is that it doesn’t take into consideration that Darcy is rarely called by his full name and has a very common one at that, unlike Holmes and Dracula.  So, here is the above result modified to search for “Mr. Darcy” rather than simply “Darcy.”  It is not ideal: how often does one, when writing about Austen’s ideal man, so formally refer to him as “Mr. Darcy”?  But one should at least be able to mentally average the two results to attain some sense of our Darcy’s popularity in English writing:

Screen Shot 2013-02-14 at 6.45.08 PM

So clearly, this is not true among all authors and their creations.  Austen gives Darcy a run for his money.  Now, one must also take into account that Austen published far more texts than Stoker or Doyle.  Hers were also far more popular–anyone heard of or remember The White Company? No? That suggests to me that Doyle’s talent with the written word is not as strong as Holmes’s persistence in the memory.

Further, this research suggests that Stoker and Doyle shared a similar relationship with their fictional creations and made similar word choices.  We can’t definitively prove that Stoker and Doyle were particularly terrible writers, but the results suggest that other writers do not stand in the shadows while their creations take the limelight as these two do.

As a final note: the class discussion of anime reminded me of a statistic I read long ago that stated that there were more Sherlock Holmes societies in Japan than there were in the UK.  As it turns out, according to the list of active Sherlockian societies kept by Peter E. Blau (a member of the Baker Street Irregulars, the most illustrious Sherlockian society), Japan has 15 societies while the UK has 16.  Still, the figure is impressive and made me curious how Holmes’ popularity (and Dracula’s) compared by geographical region and language.  Alas, I don’t know how to translate Holmes into Japanese or Russian (there is a large following there as well) so I’m limited to American and British English for Google’s Ngram Viewer.  However, the results were still fascinating:

I find it fascinating that to the Americans, Holmes's popularity grew far more rapidly than in England, yet once again, the vampire steals the show.

It would seem that while Holmes was very popular in the UK since his creation, Dracula has recently stolen center stage--in spite of all the latest Sherlock re-imaginings.

In conclusion, I think Holmes would have been a DHer.  The man who cried, “Data! Data! Data! [....] I can’t make bricks without clay,” would have appreciated the way in which DH offers one tremendous information at one’s fingertips and the tools to make sense of it.  Holmes would especially have to appreciate the fact that the methods of the Digital Humanities could be used to catch our own Napoleon of Crime, so to speak, Osama bin Laden.  And as for Dracula?  Well, clearly DH has brought him out into the light of day.