Monday, April 15, 2013

Unshorten URLs in R

Well, of course, this tip comes out one week after I needed it. The author uses the RCurl package to request the headers for the shortened URL and then parses the "Location" field in the response. This sort of operation is needed frequently, especially when working with data from Twitter. Twitter now runs every link through its own shortener, even links that were already shortened, so every link in every tweet has to go through a process like this to resolve to the full URL.
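For reference, here is a minimal sketch of that single-hop idea. It is not the tip's original code: the function names `extract_location` and `unshorten_once` are made up for illustration, and it assumes the RCurl package is installed and that the server answers a header-only request with a redirect.

```r
# Hypothetical helper: pull the Location value out of raw header text.
# Returns NA if the response carries no Location header.
extract_location <- function(header_text) {
  lines <- strsplit(header_text, "\r?\n")[[1]]
  hit <- grep("^location:", lines, ignore.case = TRUE, value = TRUE)
  if (length(hit) == 0) return(NA_character_)
  trimws(sub("^location:", "", hit[1], ignore.case = TRUE))
}

# One hop: ask for headers only and do not follow the redirect,
# so the Location header of the short URL itself comes back.
# Requires the RCurl package.
unshorten_once <- function(url) {
  header_text <- RCurl::getURL(url, header = TRUE, nobody = TRUE,
                               followlocation = FALSE)
  extract_location(header_text)
}
```

Calling `unshorten_once()` on a short link should give the next URL in the chain, or NA once there is nowhere left to redirect. With doubly shortened Twitter links, one hop is usually not the end of the chain.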

My solution is very similar to the RLangTip, but instead of using RCurl I make a system call to "curl" and keep requesting the header for each URL returned until no Location field is found. That last URL is the final one. It's a little ugly, and I'm sure it can be sped up and improved upon, but it works well enough...
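The loop described above can be sketched roughly like this. Treat it as an illustration under assumptions rather than the exact code: the names `fetch_headers` and `unshorten` are hypothetical, the `max_hops` cap is an added safeguard against redirect loops, and it assumes a `curl` binary is on the PATH and that each Location value is an absolute URL.

```r
# Hypothetical helper: fetch response headers with the curl command-line tool.
# -s silences progress output, -I sends a HEAD request and prints the headers.
fetch_headers <- function(url) {
  system2("curl", c("-s", "-I", shQuote(url)), stdout = TRUE)
}

# Follow Location headers one hop at a time until none is returned.
# `fetch` can be swapped for any function that maps a URL to header lines.
unshorten <- function(url, fetch = fetch_headers, max_hops = 10) {
  for (i in seq_len(max_hops)) {
    lines <- fetch(url)
    hit <- grep("^location:", lines, ignore.case = TRUE, value = TRUE)
    if (length(hit) == 0) break  # no redirect left: this is the final URL
    # trimws() also drops the trailing carriage return from the header line
    url <- trimws(sub("^location:", "", hit[1], ignore.case = TRUE))
  }
  url
}
```

Something like `sapply(short_urls, unshorten)` would then resolve a whole vector of links, one curl call per hop. Spawning a process for every hop is the slow part, which is one obvious place to speed things up.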

Find me on twitter...
