I love to travel, and cheap flights make that a lot easier. There are many websites and even Twitter accounts that exist only to point out cheap flights. For about a year I’ve followed several Twitter accounts for fare deals and have alerts turned on so that I don’t miss something good. Currently, I get about 50 alerts a day that light up my phone and distract me from whatever I’m doing… but only one or two of those are actually from my home airport of DEN. So I created a Python script to help me out.
So before we jump into the code, let’s talk it through. Our program is going to connect to Twitter using the handle DEN_flight_deal, which I created. Then every five minutes, it will check several Twitter handles that often advertise cheap flights, and if any of them mention Denver, it will retweet that tweet. It also checks the direct messages on every pass, and if it receives a “test” message it responds so that I know it’s still running.
I’m using the Tweepy package, which appears to mostly be a wrapper around the Twitter API. It handles all the hard stuff and allows this program to work in under 100 lines. I went on Twitter and created the DEN_flight_deal account, then registered an app at apps.twitter.com, which gave me the keys and tokens needed to authenticate with Twitter. There are lots of tutorials on this out there, but it’s really easy: you just need a name, a description, and a website address to enter.
So let’s look at the first part of the code: the matching functions for various cities. I plan to roll this out to cities other than Denver, so there is one function per city. I could have used regular expressions (and probably should have), but this was quicker and works fine for now. Note that the code is sometimes messed up by WordPress, so I replaced all of my “<0” with “==-1”, but the greater-than sign may still show up weird. Sorry.
# Find matches from Denver
def DENmatches(s):
    # Find start of web address so that it isn't included in search
    # If it's not there (-1) then change stop to length of text
    stop = s.find("http", 0, len(s))
    if stop == -1:
        stop = len(s)
    # See if terms are in string between 0 and stop
    # Checking to make sure Sweden and Denmark aren't included by accident
    if s.upper().find("DEN", 0, stop) >= 0:
        if s.upper().find("SWEDEN", 0, stop) == -1:
            if s.upper().find("DENMARK", 0, stop) == -1:
                return True
    # If the function makes it to here, then there are no matches
    return False

# Find matches from NYC
def NYCmatches(s):
    # Find start of web address so that it isn't included in search
    # If it's not there (-1) then change stop to length of text
    stop = s.find("http", 0, len(s))
    if stop == -1:
        stop = len(s)
    # See if terms are in string between 0 and stop
    # The text is upper-cased, so the search terms must be upper case too
    if s.upper().find("NYC", 0, stop) >= 0:
        return True
    if s.upper().find("LGA", 0, stop) >= 0:
        return True
    if s.upper().find("JFK", 0, stop) >= 0:
        return True
    if s.upper().find("EWR", 0, stop) >= 0:
        return True
    if s.upper().find("NEW YORK CITY", 0, stop) >= 0:
        return True
    if s.upper().find("NEWARK", 0, stop) >= 0:
        return True
    # If the function makes it to here, then there are no matches
    return False

# Find matches from Washington DC
def WASmatches(s):
    # Find start of web address so that it isn't included in search
    # If it's not there (-1) then change stop to length of text
    stop = s.find("http", 0, len(s))
    if stop == -1:
        stop = len(s)
    # See if terms are in string between 0 and stop
    if s.upper().find("DCA", 0, stop) >= 0:
        return True
    if s.upper().find("IAD", 0, stop) >= 0:
        return True
    if s.upper().find("BWI", 0, stop) >= 0:
        return True
    # For finding Washington, make sure it isn't talking about Seattle, Washington
    if s.upper().find("WASHINGTON", 0, stop) >= 0:
        if s.upper().find("SEATTLE", 0, stop) == -1:
            return True
    # For finding DC, don't convert to upper
    if s.find("DC", 0, stop) >= 0:
        return True
    # If the function makes it to here, then there are no matches
    return False

# Find matches for OKC and Tulsa
def OKmatches(s):
    # Find start of web address so that it isn't included in search
    # If it's not there (-1) then change stop to length of text
    stop = s.find("http", 0, len(s))
    if stop == -1:
        stop = len(s)
    # See if terms are in string between 0 and stop
    if s.upper().find("OKC", 0, stop) >= 0:
        return True
    if s.upper().find("TUL", 0, stop) >= 0:
        return True
    if s.upper().find("OKLAHOMA", 0, stop) >= 0:
        return True
    if s.upper().find("TULSA", 0, stop) >= 0:
        return True
    # If the function makes it to here, then there are no matches
    return False

# Find matches for San Diego
def SANmatches(s):
    # Find start of web address so that it isn't included in search
    # If it's not there (-1) then change stop to length of text
    stop = s.find("http", 0, len(s))
    if stop == -1:
        stop = len(s)
    # See if terms are in string between 0 and stop
    if s.upper().find("SAN", 0, stop) >= 0:
        return True
    if s.upper().find("SAN DIEGO", 0, stop) >= 0:
        return True
    # If the function makes it to here, then there are no matches
    return False

# Find matches for Dallas
def DALmatches(s):
    # Find start of web address so that it isn't included in search
    # If it's not there (-1) then change stop to length of text
    stop = s.find("http", 0, len(s))
    if stop == -1:
        stop = len(s)
    # See if terms are in string between 0 and stop
    if s.upper().find("DAL", 0, stop) >= 0:
        return True
    if s.upper().find("DFW", 0, stop) >= 0:
        return True
    # If the function makes it to here, then there are no matches
    return False

# Find matches from Houston
def HOUmatches(s):
    # Find start of web address so that it isn't included in search
    # If it's not there (-1) then change stop to length of text
    stop = s.find("http", 0, len(s))
    if stop == -1:
        stop = len(s)
    # See if terms are in string between 0 and stop
    if s.upper().find("HOU", 0, stop) >= 0:
        return True
    if s.upper().find("IAH", 0, stop) >= 0:
        return True
    # If the function makes it to here, then there are no matches
    return False
Here, I’m using the find function a lot. I convert the text of the tweet to upper case, then try to find the airport code or city name. Since the text is upper-cased, the search term has to be written in upper case too. If the text contains the term, find returns the position where it occurs; if it doesn’t, it returns -1. Sometimes the shortened link at the bottom of a tweet happens to include an airport code by accident, so I first find where the web address starts by looking for “http” and then only search from the beginning of the text up to that point.
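Since I mentioned regular expressions, here is a rough sketch of what that version might look like. This is not the code the bot runs; make_matcher and den_matches are names made up for this example. It matches whole words only, so “Sweden” and “Denmark” no longer trip the DEN check and don’t need to be excluded by hand. One side effect of that assumption: a tweet like “Denver to Sweden” now counts as a match, where the functions above would reject it.

```python
import re

def make_matcher(terms):
    """Return a function that checks a tweet for any of the given terms.

    Hypothetical helper for illustration; not part of the original script.
    Terms are matched case-insensitively and only as whole words.
    """
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )

    def matches(text):
        # Ignore everything from the first link onward, as the find() version does
        stop = text.find("http")
        body = text if stop == -1 else text[:stop]
        return pattern.search(body) is not None

    return matches

# "Denver" has to be listed explicitly, since "DEN" no longer matches inside it
den_matches = make_matcher(["DEN", "Denver"])
```

Calling den_matches(tweet.text) would then stand in for DENmatches(tweet.text), and the other cities only need a different list of terms.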
Now comes the real code:
import datetime
import tweepy
import time

# List of twitter handles that might have deals for DEN
handleList = ["TheFlightDeal", "SecretFlying", "FareDealAlert",
              "airfarewatchdog", "Hopper_Deals", "DealsFromDEN"]

# These are the keys and tokens needed to authenticate with twitter
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""

# Authenticate with twitter using the keys and tokens
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

# Create api object for this account
api = tweepy.API(auth)

# Infinite Loop
while True:
    # Surround everything in a try/except so that if an error occurs,
    # it will only end that session and will not kill the entire program
    try:
        print("Currently " + str(datetime.datetime.utcnow()) + " UTC")

        # Gets the list of direct messages from twitter
        print("Checking messages...")
        messageList = api.direct_messages()
        print("There are: " + str(len(messageList)) + " messages")

        # Iterate through the messages
        for message in messageList:
            print("At " + str(message.created_at) + " UTC, " + str(message.sender.screen_name) + " said: " + str(message.text))
            # If the message is a test, respond with ACK and delete the message
            if message.text.upper().find('TEST', 0, len(message.text)) >= 0:
                print("Responding to test message with ACK")
                api.send_direct_message(message.sender.screen_name, text='ACK')
                print("Destroying received message")
                api.destroy_direct_message(message.id)

        # Iterate through each of the username/handles to check for deals
        for handle in handleList:
            # Load the handle as user and get timeline
            user = api.get_user(screen_name=handle)
            timeline = user.timeline()

            # Print status info
            print("Checking " + handle + " now ...")

            # Iterate through each tweet returned from their timeline (limit: 20)
            for tweet in timeline:
                # Check to see if there is a match in the tweet text
                if DENmatches(tweet.text):
                    # Check to see if the tweet occurred in the last 6 minutes
                    # Have to check both days and seconds to prevent errors, still not sure why
                    # Another method would be to check if the tweet has already been retweeted but that's a bit harder
                    timediff = datetime.datetime.utcnow() - datetime.datetime.strptime(str(tweet.created_at), "%Y-%m-%d %H:%M:%S")
                    if timediff.days < 1 and timediff.seconds < 6*60:
                        # Print info about the tweet
                        print("Found a match on " + handle + " at " + str(tweet.created_at) + " with ID: " + str(tweet.id))
                        print("It was posted " + str(timediff) + " ago")
                        print(tweet.text)
                        print("Retweeting...")
                        # Try/except to catch any twitter errors like retweeting multiple times
                        # This is specifically here rather than using the outer try/except because
                        # if there is an error, we don't want it to stop the other retweets in this
                        # session because the next time it gets to them they will be too old
                        try:
                            tweet.retweet()
                        except tweepy.TweepError as e:
                            print(e)

    # Catch any exceptions and print them out
    except Exception as e:
        print(e)

    # Sleep for five minutes
    print("sleeping...")
    time.sleep(5*60)
    print("awake!")
Again, I apologize for the crappy stuff WordPress does to the code. I converted all of my expressions to use a greater-than rather than a less-than to prevent most of the problems, but it's still not perfect. I'm not going to go over the code too much because I think it's commented pretty well. Key things here are the use of lists, try/except blocks, and timedelta.
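About the comment in the code on having to check both days and seconds: the likely reason is that a timedelta's seconds attribute only holds what is left over after the whole days are taken out, so a tweet that is one day and 30 seconds old has seconds equal to 30 and would look brand new on its own. A small sketch of an alternative is to compare the whole timedelta against a six-minute window. The is_recent name and the fixed timestamps are just for this example.

```python
import datetime

def is_recent(created_at, now, window_minutes=6):
    """True if created_at falls within the last window_minutes before now.

    Hypothetical helper for illustration. Comparing whole timedeltas avoids
    having to look at .days and .seconds separately.
    """
    age = now - created_at
    return datetime.timedelta(0) <= age < datetime.timedelta(minutes=window_minutes)

# Fixed timestamps so the example is repeatable
now = datetime.datetime(2018, 1, 2, 12, 0, 0)
three_minutes_old = now - datetime.timedelta(minutes=3)
day_old = now - datetime.timedelta(days=1, seconds=30)

# .seconds alone is misleading: it ignores the whole day that has passed
misleading_seconds = (now - day_old).seconds
```

In the loop above, this would be called as is_recent(created, datetime.datetime.utcnow()), where created is the parsed tweet time.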
Right now there is a separate program for each city but I may change that in the future so that it's combined into one program. I'd like to let it run for a few months first. 🙂
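If I do combine them, one possible shape is a single table of cities instead of one function per city. This is only a sketch of the idea, not something I'm running: the CITIES table and the cities_mentioned name are made up here, and only three of the cities are filled in.

```python
# Hypothetical table: search terms (upper case) and words that rule a city out
CITIES = {
    "DEN": {"terms": ["DEN"], "exclude": ["SWEDEN", "DENMARK"]},
    "NYC": {"terms": ["NYC", "LGA", "JFK", "EWR", "NEW YORK CITY", "NEWARK"],
            "exclude": []},
    "HOU": {"terms": ["HOU", "IAH"], "exclude": []},
}

def cities_mentioned(text):
    """Return a sorted list of the city keys whose terms appear in the tweet."""
    # Only search the part of the tweet before the link
    stop = text.find("http")
    if stop == -1:
        stop = len(text)
    body = text[:stop].upper()

    found = []
    for city, rules in CITIES.items():
        has_term = any(term in body for term in rules["terms"])
        is_excluded = any(word in body for word in rules["exclude"])
        if has_term and not is_excluded:
            found.append(city)
    return sorted(found)
```

Each city's account could then retweet whenever its own key shows up in the returned list.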
You can download the full code here. If you have any questions or comments, put them below and I’ll try to get back with ya. Thanks!