This segment of the project blog explains how I pulled the Twitter data. Although this might seem like a simple task at first, the Twitter API throttles the number of requests you can make. After around 4,000 pulls in a single session, the request fails and returns a 402 error.
I worked around this with a Windows batch script that waits in between pulling data for different stocks (a sketch of that wrapper appears after the script below). Twitter also prevents users with a standard API key from pulling tweets more than 7 days old, so I pulled many different stocks instead of a long history of one.
The code I used to write the Twitter data into a CSV file is below.
tweetpull.py
import tweepy
import csv
import sys
import datetime
# Credentials redacted; substitute your own app's keys and tokens
consumer_key = "YOUR_CONSUMER_KEY"
consumer_secret = "YOUR_CONSUMER_SECRET"
access_token = "YOUR_ACCESS_TOKEN"
access_token_secret = "YOUR_ACCESS_TOKEN_SECRET"
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
# Open/Create a file to append data
csvFile = open('tweetdata/' + sys.argv[1] + 'tweets.csv', 'a', encoding='utf-8', newline='')
# Use csv writer; write the header row only when the file is new/empty,
# so appending on later runs does not duplicate it
fields = ('date', 'text', 'followers')
csvWriter = csv.writer(csvFile, lineterminator='\n')
if csvFile.tell() == 0:
    csvWriter.writerow(fields)
# The standard search API only returns roughly the last 7 days of tweets, so no explicit date bound is set here
for tweet in tweepy.Cursor(api.search, q=sys.argv[1], lang="en").items():
    print(tweet.created_at, tweet.text)
    follower_count = tweet.user.followers_count
    #if tweet.created_at >= datetime.datetime(2019,2,24):
    # The file is opened with UTF-8 encoding, so the tweet text can be written as-is
    csvWriter.writerow([tweet.created_at, tweet.text, follower_count])
csvFile.close()
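To automate this across stocks, the script above was driven by the Windows batch wrapper mentioned at the top of this post. Below is a minimal sketch of what that wrapper could look like; the file name, the ticker list, and the 15-minute wait are assumptions for illustration, not the exact script I ran. The Python script is invoked as python tweetpull.py <symbol>, and the symbol is used both as the search query and in the output file name.

pullall.bat
@echo off
REM Hypothetical list of stock symbols to pull; substitute your own tickers
for %%S in (AAPL MSFT AMZN TSLA GOOG) do (
    REM Pull recent tweets for this symbol into tweetdata/<symbol>tweets.csv
    python tweetpull.py %%S
    REM Wait 15 minutes so the standard API rate-limit window can reset
    timeout /t 900 /nobreak
)

Any delay long enough for the rate limit to reset will do; 15 minutes matches the length of Twitter's standard rate-limit window.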