I came across some really interesting data today that speaks directly to everything we’ve been discussing in class: Twitter, topic modeling, word clouds, data collection. This site categorizes tweets in New York City by the language they are written in. Here the topics (languages) are given to us up front, but the map also gives us an idea of the ethnic makeup and diversity of neighborhoods within the five boroughs. Imagine it as a word cloud of what makes up the city!
Having lived in NYC for 10 years before moving to DC last year, I find this extremely enlightening, especially since I was always told that Queens (where I lived) was the most diverse county in the entire United States. Judging from this map, Manhattan is far more diverse, at least in terms of languages spoken. I’m also shocked that Chinese isn’t one of the languages aggregated by this site, as there is a very large Chinese population in Flushing, Queens. Additionally, I lived in Astoria, Queens, which has a large Greek community. Had I done an exercise similar to last night’s farmer’s market exercise before seeing this data, I would have included topics/languages not seen here.
Make sure to zoom in and out at street level, and use the roads-to-black scroll bar at the bottom to find the view that works best for you. There is also a view of London.
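
For anyone who wants to put a number on a comparison like Manhattan versus Queens, here is a minimal Python sketch. It assumes you already have tweets as (area, language) pairs, which is my own made-up format and not something the site provides, and it uses the Shannon index as one possible measure of diversity. The function names and the sample records are hypothetical.

```python
from collections import Counter, defaultdict
from math import log


def languages_by_area(tweets):
    """Group (area, language) pairs into one language Counter per area."""
    counts = defaultdict(Counter)
    for area, language in tweets:
        counts[area][language] += 1
    return dict(counts)


def shannon_diversity(language_counts):
    """Shannon index: 0 for a single language, higher for a more even mix."""
    total = sum(language_counts.values())
    return -sum((n / total) * log(n / total) for n in language_counts.values())


# Made-up sample records, for illustration only (not data from the site).
sample = [
    ("Area A", "English"), ("Area A", "Spanish"),
    ("Area A", "Russian"), ("Area A", "French"),
    ("Area B", "English"), ("Area B", "English"),
    ("Area B", "English"), ("Area B", "Spanish"),
]

by_area = languages_by_area(sample)
for area, counts in sorted(by_area.items()):
    print(area, dict(counts), round(shannon_diversity(counts), 3))
```

With these sample records, Area A (four languages, evenly split) scores higher than Area B (mostly one language). The same caveat from above applies: a language the site does not aggregate, like Chinese or Greek, would never show up in the counts at all.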