Exploring Mangalyaan tweets with R

Mangalyaan is the spacecraft of the Indian Space Research Organisation’s Mars Orbiter Mission, which entered Mars orbit last week. There were several tweets with the hashtag #Mangalyaan about it, and I wanted to use R to explore them. Tiger Analytics published an interesting post on this topic last year when Mangalyaan launched. I found their analysis to infer topics particularly interesting, and I hope they repeat it with the latest tweets. My goals and methods of analysis here are much more basic. I wanted to do the following:

  • Extract tweets containing the hashtag #Mangalyaan and create a word cloud
  • Attempt to find some topics/groupings from tweets
    • Try out the R topicmodels package to infer topics
    • Find groupings using hierarchical clustering
    • Find groupings using graph community detection algorithms

The full code and explanation are in the following location. I was able to extract about 1000 tweets spanning 4 days, from Sep 23, 2014 to Sep 26, 2014, and used them for the analysis below. All of the analysis below should be read with this small sample size in mind. The word cloud of the frequent terms in the tweets is:
[Figure: twWordcloud]
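
As a rough illustration of how term frequencies for a word cloud can be computed, here is a minimal sketch. The toy tweets, the cleaning rules and the tiny stop list are all made up for this example and are not the preprocessing used for the figure above. The counting is base R only, and the cloud is drawn only if the wordcloud package happens to be installed.

```r
# Toy stand-ins for tweet text (hypothetical; the post used about 1000 real tweets).
tweets <- c(
  "ISRO Mangalyaan enters Mars orbit #Mangalyaan",
  "Congratulations ISRO on Mars orbit success! #Mangalyaan",
  "India reaches Mars on first attempt #Mangalyaan http://example.com/x"
)

# Basic cleaning: lower-case, drop URLs, keep letters and spaces only.
clean <- tolower(tweets)
clean <- gsub("http\\S+", " ", clean, perl = TRUE)
clean <- gsub("[^a-z ]", " ", clean)

# Split into words and count them, skipping a small hand-picked stop list.
words <- unlist(strsplit(clean, "\\s+"))
stop_words <- c("", "on", "the", "a", "of", "in")
words <- words[!(words %in% stop_words)]
freq <- sort(table(words), decreasing = TRUE)
print(freq)

# Draw the cloud only when the wordcloud package is available.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(names(freq), as.integer(freq), min.freq = 1)
}
```

With real tweets, a proper stop-word list (for example the one shipped with the tm package) would be a better choice than the hand-picked one here.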

Next, I used the R package topicmodels with the number of topics set to 5 (no particular reason) and got the following result for the top 10 words in each topic:
[Figure: topics]
I did only basic preprocessing and ran the model with default parameters. Better preprocessing and model parameter tuning might give better results.
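
A minimal sketch of this kind of topic model fit is shown below, assuming the tm and topicmodels packages are installed. The toy tweets, the choice of 2 topics and the seed are assumptions made so the example stays small; the run described above used 5 topics and the top 10 words on the real tweets.

```r
library(tm)
library(topicmodels)

# Toy stand-ins for tweet text (hypothetical data).
tweets <- c(
  "ISRO Mangalyaan enters Mars orbit",
  "Mars orbit insertion success for ISRO",
  "Mangalyaan reaches Mars orbit on first attempt",
  "Modi congratulates ISRO scientists",
  "Prime Minister Modi congratulates the scientists team",
  "Scientists celebrate as Modi congratulates them"
)

# Basic preprocessing: lower-case, strip punctuation, drop English stop words.
corpus <- VCorpus(VectorSource(tweets))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))

# Document-term matrix; LDA needs every document to keep at least one term.
dtm <- DocumentTermMatrix(corpus)
dtm <- dtm[rowSums(as.matrix(dtm)) > 0, ]

# Fit LDA with k topics; the seed only makes the run repeatable.
model <- LDA(dtm, k = 2, control = list(seed = 42))

# Top 5 terms per topic, one column per topic.
print(terms(model, 5))
```

On a corpus this small the topics are not meaningful; the point is only the sequence of calls from raw text to `terms()`.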

Applying hierarchical clustering on frequent terms gives the following grouping:
[Figure: hierClust]
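
Here is a small base R sketch of clustering frequent terms by the tweets they appear in. The toy tweets, the frequency cutoff, the binary (Jaccard) distance and average linkage are all assumptions chosen for illustration, not necessarily the settings behind the figure above.

```r
# Toy stand-ins for tweet text (hypothetical data).
tweets <- c(
  "isro mars orbit success",
  "isro mars orbit insertion",
  "mars orbit isro",
  "modi congratulates scientists",
  "modi congratulates scientists team",
  "modi congratulates isro"
)

# Binary term-by-tweet matrix: 1 if the term occurs in the tweet.
tokens <- lapply(strsplit(tolower(tweets), "\\s+"), unique)
vocab <- sort(unique(unlist(tokens)))
tdm <- sapply(tokens, function(w) as.integer(vocab %in% w))
rownames(tdm) <- vocab

# Keep only terms that appear in at least 2 tweets.
frequent <- tdm[rowSums(tdm) >= 2, , drop = FALSE]

# Jaccard-style distance between terms, then agglomerative clustering.
d <- dist(frequent, method = "binary")
fit <- hclust(d, method = "average")

# Draw the dendrogram and cut it into 2 groups of terms.
plot(fit)
groups <- cutree(fit, k = 2)
print(groups)
```

Terms that tend to show up in the same tweets end up close together in the tree, so cutting the tree gives rough groupings of related words.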

I found that the igraph package has some easy-to-use functions for community detection and plotting. Here the co-occurrence of words across tweets is used to construct a graph, and a community detection algorithm is applied to that graph. The result is plotted both as a dendrogram and as a graph plot.
[Figure: communityDendPlot]
[Figure: communityGraphPlot]
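
A minimal sketch of the co-occurrence graph and community detection steps follows, assuming the igraph package is installed. The toy tweets are made up, and `cluster_fast_greedy` is just one of several igraph community detection functions; the post does not say which algorithm produced the plots above.

```r
library(igraph)

# Toy stand-ins for tweet text (hypothetical data).
tweets <- c(
  "isro mars orbit success",
  "isro mars orbit insertion",
  "mars orbit isro",
  "modi congratulates scientists",
  "modi congratulates scientists team",
  "modi congratulates isro"
)

# Binary term-by-tweet matrix: 1 if the term occurs in the tweet.
tokens <- lapply(strsplit(tolower(tweets), "\\s+"), unique)
vocab <- sort(unique(unlist(tokens)))
tdm <- sapply(tokens, function(w) as.integer(vocab %in% w))
rownames(tdm) <- vocab

# Term-by-term co-occurrence counts: number of tweets containing both terms.
cooc <- tdm %*% t(tdm)
diag(cooc) <- 0

# Weighted undirected graph with one vertex per term.
g <- graph_from_adjacency_matrix(cooc, mode = "undirected", weighted = TRUE)

# Greedy modularity optimisation; edge weights are picked up automatically.
comm <- cluster_fast_greedy(g)
print(membership(comm))

# Dendrogram of the merges, then the graph with communities highlighted.
plot_dendrogram(comm)
plot(comm, g)
```

Other functions such as `cluster_walktrap` or `cluster_louvain` can be swapped in at the community detection step, although not all of them produce a dendrogram.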

In summary, this was a fun exercise. I got to learn a bit about the following R packages: twitteR, topicmodels, igraph.


4 Responses to Exploring Mangalyaan tweets with R

  1. Hi, nice work! I wonder how do you draw the last two pictures with igraph? any codes? thx.

  2. Hi!
    I just came by your blog and it’s simply amazing!
    Full of well written and structured posts.
    I’ll keep reading here and wish you success!
