
Grab Twitter Results with PHP and Remove Retweets and Duplicates

Last updated on July 9, 2012 in Development


A Twitter search for many keywords gives interesting results, but the default results are filled with retweets and duplicates (or tweets that are very similar). So I wanted to share some PHP code that uses the Twitter API to grab tweets for a search query, then runs the results through a few functions that remove retweets, remove similar tweets, and turn link text into actual links.

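To start, retrieving Twitter search results is pretty straightforward. We don’t even need an API key, and the API accepts a search operator in the query that excludes retweets: -filter:retweets (the leading minus negates the operator; without it, filter:retweets matches retweets rather than removing them).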
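As a minimal sketch of that first step, the query string can be built like this. It assumes the negated operator -filter:retweets is the form that excludes retweets. The function name buildSearchUrl and the endpoint passed to it are hypothetical placeholders, not part of the original script.

```php
<?php
// Build a search URL whose query excludes retweets.
// Assumption: $endpoint is a placeholder; pass in whichever search endpoint you use.
function buildSearchUrl(string $term, string $endpoint): string
{
    // The leading minus negates the operator, so retweets are filtered out.
    $query = trim($term) . ' -filter:retweets';

    // http_build_query() URL-encodes the value (spaces become "+", ":" becomes "%3A").
    return $endpoint . '?' . http_build_query(['q' => $query]);
}

echo buildSearchUrl('GitHub', 'https://example.com/search.json'), "\n";
// prints: https://example.com/search.json?q=GitHub+-filter%3Aretweets
```

Keeping the endpoint as an argument means the same helper works whichever version of the search API is available.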

So, let’s start by grabbing a GET parameter called term. If it’s passed in the URL, we make a cURL request for that search term, with retweets filtered out of the JSON response. Next, we send the JSON response to the getResults function, which parses it and gives us the actual tweets. getResults also passes each tweet’s text to the autolink function, which uses a regex to turn link text into actual links.
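The flow described above might look roughly like the following sketch. It keeps the getResults and autolink names from the text, but the bodies are illustrative and are not the original script. The helper fetchJson is hypothetical, and the response shape (a top-level "results" array whose items have a "text" field) is an assumption.

```php
<?php
// Hypothetical helper: fetch a URL with cURL and return the body, or null on failure.
function fetchJson(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // give up after 10 seconds
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Escape the tweet text for HTML output, then wrap http(s) URLs in anchor tags.
function autolink(string $text): string
{
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    return preg_replace('~\bhttps?://[^\s<]+~i', '<a href="$0">$0</a>', $escaped);
}

// Parse the JSON response and return the autolinked tweet texts.
// Assumption: tweets live in a top-level "results" array, each with a "text" field.
function getResults(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data['results']) || !is_array($data['results'])) {
        return []; // invalid JSON or an unexpected shape
    }

    $tweets = [];
    foreach ($data['results'] as $item) {
        if (isset($item['text']) && is_string($item['text'])) {
            $tweets[] = autolink($item['text']);
        }
    }
    return $tweets;
}
```

In a page script, this would sit behind a check such as isset($_GET['term']), with the fetched body passed to getResults only when fetchJson returns a non-null string. The regex is deliberately simple, so a URL followed directly by punctuation will pull that punctuation into the link.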

Finally, my removeSimilar function uses PHP’s similar_text function to calculate the similarity between two strings and returns only the non-similar tweets, removing duplicates and near-duplicates. Here is the code:
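To illustrate the similarity filter on its own, here is a minimal sketch of a removeSimilar function built on similar_text. It is not the original implementation. The 80 percent threshold and the case-insensitive comparison are assumptions chosen for the example.

```php
<?php
// Keep a tweet only if it is less than $threshold percent similar
// to every tweet that has already been kept.
// Assumption: 80.0 is an arbitrary starting threshold; tune it for your data.
function removeSimilar(array $tweets, float $threshold = 80.0): array
{
    $kept = [];
    foreach ($tweets as $tweet) {
        $isDuplicate = false;
        foreach ($kept as $existing) {
            // similar_text() writes the similarity percentage into its third argument.
            similar_text(strtolower($tweet), strtolower($existing), $percent);
            if ($percent >= $threshold) {
                $isDuplicate = true;
                break;
            }
        }
        if (!$isDuplicate) {
            $kept[] = $tweet;
        }
    }
    return $kept;
}

print_r(removeSimilar([
    'PHP is great for scripts',
    'PHP is great for scripts!',
    'Completely different text about maps',
]));
```

The first tweet in each group of near-duplicates wins. Every tweet is compared against every tweet already kept, so this suits a page of search results rather than a large archive.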

And here is the demo of Twitter search results for the search term GitHub.

