By Udisha Alok
Social media is a veritable gold mine of information and a window into the collective psychology of people across the world. Be it politicians, celebrities, creative artists, professors or students - everyone seems to be on Twitter.
It has become increasingly popular, with tweets from famous personalities influencing millions of followers, and the markets too!
So Twitter data is used for sentiment analysis in various spheres including trading.
This blog will show how we can fetch data from Twitter using the Twitter API. We will use the Tweepy library for it and explore in detail the various types of data we can get using it. This blog is the first of a two-part mini-series.
Let us see what we will cover in this blog.
- What is Twitter API?
- Twitter API v2 access levels
- Getting access to the Twitter API
- Essential access vs Elevated access
- Tips for elevated access application
- Tweepy for fetching data from Twitter using Python
- Installing Tweepy
- API Authentication
- Get user id and screen name
- Get user info
- Get follower count
- Get tweets from your timeline
- Get tweets by a user
- Filtering retweets and replies
- Search using pagination
- Get replies to a tweet
- Get tweets by a hashtag
- Get tweets by a keyword
- Combining multiple queries
What is Twitter API?
Twitter API is the official programmatic endpoint provided by Twitter. It allows developers to access the enormous amount of public data on Twitter that millions of users share daily.
The latest version of the Twitter API is v2, officially the primary Twitter API. However, Twitter API v1.1 is still in use.
There are some crucial differences between the two versions of the API. So, some of the functionalities from v1.1 may not work with v2.
As per the Twitter website, the v2 is built “to better serve a broader collection of needs, introduced new features and endpoints, and improved upon the developer experience.”
Please refer to this link for more information about the differences between the two versions.
Twitter API v2 access levels
Twitter API v2 offers different access levels: essential, elevated, and academic research. Elevated+ level is also coming soon, as per the official website.
Here is an image comparing some of the features available under each access level.
For some of the code presented in this post, you need to have elevated access.
Getting access to the Twitter API
For using the Twitter API v2, you need:
- An approved developer account.
- Keys and tokens from a developer App located within a Project for authentication.
Read this official guide to learn more about getting started with the Twitter API.
Once you have signed up for the developer account, you have essential access immediately. Elevated access, which has to be requested, is needed to make requests to the Twitter API v1.1 and for added functionalities of v2.
Essential access vs Elevated access
Briefly, the difference between essential and elevated access is as follows:
- Essential access: No need to apply. 1 App, 1 Project.
- Elevated access: Requires an approved developer account application. It usually takes 48-72 hours. However, it is possible that reviewing Elevated access applications may take up to two weeks. Free access up to 2M Tweets/month, and 3 App environments. 3 Apps, 1 Project.
You need to apply for elevated access from the developer portal to run all the code discussed in this blog. The process is pretty straightforward.
Tips for elevated access application
Here are some handy tips that I found helpful to make the elevated access application get approved faster:
- Be honest in your application about the objective of your app and how you intend to use the data.
- Give detailed information about your project/app.
- If additional information is needed about any of your responses, Twitter will send you an email. Please respond to it candidly and with details as soon as possible.
- If your application gets rejected because you did not provide the information needed on time, then you can re-open that application by replying to the email thread from your inbox.
- If your application gets rejected because its use case violated the Twitter policy, you cannot re-apply for it.
Refer to these FAQs for more details.
Once your application for elevated access is approved, you will receive an email at your registered email address.
Let’s get started with the code!
Tweepy for fetching data from Twitter using Python
Tweepy is an easy-to-use Python library for accessing the Twitter API. Its API class provides access to the RESTful methods of the Twitter API. This makes it easy to understand and learn, making it a popular choice for accessing the Twitter API functionalities.
To install Tweepy from PyPI using pip, you can use the following command:
pip install tweepy
Alternatively, you can install it from the GitHub repository:
pip install git+https://github.com/tweepy/tweepy.git
When you register for the developer account with Twitter, you get the API key and secret (these function like the username and password for your App), and the access token and secret (these represent the user that owns the App). You will need these to authenticate yourself before making requests to the Twitter API.
These keys and tokens do not expire unless regenerated. They need to be saved securely so that your code can access them without revealing them to anyone else.
For this purpose, we will be creating a config file in which we will save these credentials. We will then read these values from the config file in our code.
Create the config file
1. Open Notepad (or any basic text editor), and create a file in the following format:
[twitter]
api_key = XXXXXXXXXXXXXXXXXXXXXXXXX
api_key_secret = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
access_token = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
access_token_secret = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
2. Now, save this file with the name ‘config.ini’.
Read the credentials from the config file
We will now read the values of the credentials from the config file using the configparser library.
We will authenticate ourselves using the credentials, and then create the Tweepy API object that will allow us to access the RESTful methods provided by the Twitter API.
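The two steps above could look roughly like the sketch below. It assumes a recent Tweepy 4.x release (where the OAuth 1.0a handler is called OAuth1UserHandler; older releases name it OAuthHandler) and a config.ini laid out as described earlier. The helper names read_twitter_credentials and create_api are made up for this illustration.

```python
import configparser


def read_twitter_credentials(path="config.ini"):
    """Return the four credentials stored in the [twitter] section."""
    # interpolation=None keeps any '%' character inside a secret literal
    config = configparser.ConfigParser(interpolation=None)
    if not config.read(path):
        raise FileNotFoundError(f"Could not read config file: {path}")
    section = config["twitter"]
    return {
        "api_key": section["api_key"],
        "api_key_secret": section["api_key_secret"],
        "access_token": section["access_token"],
        "access_token_secret": section["access_token_secret"],
    }


def create_api(credentials):
    """Build an authenticated tweepy.API object (assumes Tweepy 4.x)."""
    import tweepy  # imported lazily; install it with `pip install tweepy`

    auth = tweepy.OAuth1UserHandler(
        credentials["api_key"],
        credentials["api_key_secret"],
        credentials["access_token"],
        credentials["access_token_secret"],
    )
    # wait_on_rate_limit makes Tweepy sleep instead of failing on a rate limit
    return tweepy.API(auth, wait_on_rate_limit=True)


# Typical usage (needs real credentials and network access):
# api = create_api(read_twitter_credentials("config.ini"))
```

Keeping the credentials in a separate file also makes it easy to exclude them from version control.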
Get user id and screen name
Let us now get started with retrieving data from Twitter. Let us first get the user name for a particular user id and vice versa using the get_user method.
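As a minimal sketch, the lookup in both directions might be written as follows. It assumes api is an authenticated tweepy.API object and Tweepy 4.x, where get_user takes keyword arguments; the two function names are hypothetical.

```python
def screen_name_from_id(api, user_id):
    """Look up the screen name (handle) that belongs to a numeric user id."""
    user = api.get_user(user_id=user_id)
    return user.screen_name


def id_from_screen_name(api, screen_name):
    """Look up the numeric user id that belongs to a screen name."""
    user = api.get_user(screen_name=screen_name)
    return user.id


# Typical usage (the handle below is only a placeholder):
# user_id = id_from_screen_name(api, "some_handle")
# print(screen_name_from_id(api, user_id))
```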
Get user info
Let us now try to get some information about the user for whom we fetched data in the code above.
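One way to do this is to collect a few fields of the user object into a dictionary. This is a sketch, and the choice of fields is ours; the attribute names follow the Twitter API v1.1 user object, and summarize_user is a hypothetical helper.

```python
def summarize_user(user):
    """Pick a handful of fields from a user object returned by api.get_user."""
    return {
        "name": user.name,                # display name
        "screen_name": user.screen_name,  # handle without the leading @
        "description": user.description,  # profile bio
        "location": user.location,        # free-text location from the profile
        "created_at": user.created_at,    # when the account was created
        "verified": user.verified,        # True for verified accounts
    }


# Typical usage, assuming `api` is an authenticated tweepy.API object:
# print(summarize_user(api.get_user(screen_name="some_handle")))
```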
Get follower count
You can get the follower count of a user using the followers_count attribute of the user object.
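For the follower count that this section is named after, one option is to read the followers_count attribute of the user object returned by get_user. The sketch below assumes api is an authenticated tweepy.API object; get_follower_counts is a made-up name, and it also returns friends_count (the number of accounts the user follows) for comparison.

```python
def get_follower_counts(api, screen_name):
    """Return (followers, following) for the given screen name."""
    user = api.get_user(screen_name=screen_name)
    # followers_count: accounts following this user
    # friends_count: accounts this user follows
    return user.followers_count, user.friends_count


# Typical usage:
# followers, following = get_follower_counts(api, "some_handle")
```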
There are many more attributes available for the user. Feel free to explore them at leisure!
Get tweets from your timeline
You can fetch the tweets from your timeline using the home_timeline() method of the API class.
This method returns a tweepy.models.ResultSet object, which may not make much sense at a glance. Let us put the fields we are interested in into a pandas dataframe so that we can analyze it better.
The home_timeline() method returns the 20 most recent statuses, including retweets, posted by the authenticating user and that user’s friends. This is the equivalent of /timeline/home on the Web.
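A possible way to flatten the result is sketched below. It assumes api is an authenticated tweepy.API object and that the default (non-extended) tweet mode is used, so each status carries a text attribute. The column names and both helper names are our own choices.

```python
def statuses_to_rows(statuses):
    """Flatten Status objects into plain dictionaries, one per tweet."""
    return [
        {
            "created_at": status.created_at,
            "user": status.user.screen_name,
            "text": status.text,
            "retweets": status.retweet_count,
            "likes": status.favorite_count,
        }
        for status in statuses
    ]


def home_timeline_rows(api, count=20):
    """Fetch the home timeline and return it as a list of row dictionaries."""
    return statuses_to_rows(api.home_timeline(count=count))


# Typical usage, assuming pandas is installed:
# import pandas as pd
# df = pd.DataFrame(home_timeline_rows(api))
```

A list of dictionaries converts directly into a pandas dataframe, with the dictionary keys becoming the column names.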
Get tweets by a user
Let us now see how we can get the tweets by a particular user. This can be done using the user_timeline() method. Similar to the home_timeline() method, this method returns the 20 most recent statuses posted from the authenticating user or the user specified.
It’s also possible to request another user’s timeline via the user_id or screen_name parameter (older Tweepy versions also accepted an id parameter).
If you want to fetch more than 20 tweets, you can use the count parameter.
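Here is a small sketch of such a call, assuming api is an authenticated tweepy.API object on Tweepy 4.x, where the user is identified by the screen_name or user_id keyword. The wrapper get_user_tweets is hypothetical.

```python
def get_user_tweets(api, screen_name=None, user_id=None, count=20):
    """Fetch the most recent tweets of one user, identified by name or id."""
    if (screen_name is None) == (user_id is None):
        raise ValueError("pass exactly one of screen_name or user_id")
    if screen_name is not None:
        return api.user_timeline(screen_name=screen_name, count=count)
    return api.user_timeline(user_id=user_id, count=count)


# Typical usage:
# tweets = get_user_tweets(api, screen_name="some_handle", count=50)
# for tweet in tweets:
#     print(tweet.created_at, tweet.text)
```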
Filtering retweets and replies
You can exclude replies and include retweets using the exclude_replies and include_rts flags.
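The two flags might be combined as in the sketch below, which assumes api is an authenticated tweepy.API object; get_filtered_tweets is a made-up wrapper. Keep in mind that Twitter applies these filters after it has collected count tweets, so the call can return fewer tweets than requested.

```python
def get_filtered_tweets(api, screen_name, count=200,
                        exclude_replies=True, include_rts=False):
    """Fetch a user's timeline, optionally dropping replies and retweets."""
    return api.user_timeline(
        screen_name=screen_name,
        count=count,
        exclude_replies=exclude_replies,  # True drops tweets that are replies
        include_rts=include_rts,          # False drops native retweets
    )


# Typical usage: only the user's own, original tweets
# originals = get_filtered_tweets(api, "some_handle")
```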
The API object allows you to pull a maximum of 200 tweets per request. Even if you specify count as 300, you will not be able to get more than 200 tweets! Then what do we do? We will deal with this in the next section using pagination.
Search using pagination
If you want to paginate through the search results, you need to use the Cursor object from the Tweepy library. It can be used to paginate through all the API methods that support pagination.
The syntax for the constructor for the Cursor class is:
class tweepy.Cursor(method, *args, **kwargs)
method – API method to paginate for
args – Positional arguments to pass to method
kwargs – Keyword arguments to pass to method
If you want only ‘n’ items or pages returned, you can pass the limit you want to impose to the items() or pages() methods.
Here is an example with the number of items we want to fetch using items() method.
Let us now limit the number of pages we fetch using the pages() method.
In this way, we can use the Cursor to fetch more than the 200 results that a single call on the API object is limited to.
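Both variants are sketched below for the user_timeline method, assuming Tweepy 4.x and an authenticated tweepy.API object called api. All three function names are hypothetical. The small requests_needed helper only illustrates the arithmetic: it gives the minimum number of calls, assuming every page comes back full.

```python
import math


def requests_needed(total_items, per_page=200):
    """Minimum number of API calls needed to collect total_items tweets."""
    return math.ceil(total_items / per_page)


def fetch_by_items(api, screen_name, max_items):
    """Collect up to max_items tweets, letting Cursor handle the paging."""
    import tweepy  # imported lazily; requires Tweepy to be installed

    cursor = tweepy.Cursor(api.user_timeline, screen_name=screen_name, count=200)
    return list(cursor.items(max_items))


def fetch_by_pages(api, screen_name, max_pages):
    """Collect tweets page by page; each page holds up to 200 tweets."""
    import tweepy  # imported lazily; requires Tweepy to be installed

    cursor = tweepy.Cursor(api.user_timeline, screen_name=screen_name, count=200)
    tweets = []
    for page in cursor.pages(max_pages):
        tweets.extend(page)
    return tweets


# Typical usage: 300 tweets need at least two requests of up to 200 each
# tweets = fetch_by_items(api, "some_handle", 300)
```

The items() variant is convenient when you care about the number of tweets, while pages() is handy when you want to process or store each batch as it arrives.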
Get replies to a tweet
For getting replies to a tweet, we use the attribute in_reply_to_status_id_str to filter the replies to a particular tweet.
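One common approach, sketched here under assumptions, is to search for tweets addressed to the author and keep only those whose in_reply_to_status_id_str matches the tweet we care about. It assumes api is an authenticated tweepy.API object on Tweepy 4.x (where the search method is named search_tweets); get_replies is a made-up helper. The standard search only covers roughly the last week of tweets, so older replies will not show up.

```python
def get_replies(api, screen_name, tweet_id, max_candidates=100):
    """Return tweets addressed to screen_name that reply directly to tweet_id."""
    candidates = api.search_tweets(
        q=f"to:{screen_name}",  # tweets addressed to the author
        since_id=tweet_id,      # replies are always newer than the original
        count=max_candidates,   # standard search returns at most 100 per call
    )
    return [
        status
        for status in candidates
        if status.in_reply_to_status_id_str == str(tweet_id)
    ]


# Typical usage (the handle and tweet id are placeholders):
# for reply in get_replies(api, "some_handle", 1234567890):
#     print(reply.user.screen_name, reply.text)
```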
Get tweets by a keyword
We can search for keywords or hashtags using the ‘q’ parameter.
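A keyword search could be sketched as follows, assuming api is an authenticated tweepy.API object on Tweepy 4.x, where the method is called search_tweets. The wrapper name and the default language filter are our own choices.

```python
def search_keyword(api, keyword, count=100, lang="en"):
    """Return the text of recent tweets that match the keyword."""
    statuses = api.search_tweets(q=keyword, count=count, lang=lang)
    return [status.text for status in statuses]


# Typical usage:
# for text in search_keyword(api, "bitcoin", count=10):
#     print(text)
```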
Get tweets by a hashtag
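A hashtag search works like a keyword search, except that the query starts with the # character. The sketch below assumes api is an authenticated tweepy.API object on Tweepy 4.x; hashtag_query and search_hashtag are hypothetical helpers, and the first one simply makes sure the query carries exactly one leading #.

```python
def hashtag_query(tag):
    """Turn 'python' or '#python' into the search query '#python'."""
    tag = tag.strip().lstrip("#")
    if not tag:
        raise ValueError("hashtag must not be empty")
    return f"#{tag}"


def search_hashtag(api, tag, count=100):
    """Return recent tweets that carry the given hashtag."""
    return api.search_tweets(q=hashtag_query(tag), count=count)


# Typical usage:
# tweets = search_hashtag(api, "python", count=10)
```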
Combining multiple queries
We can combine two or more queries by simply concatenating the search queries. This is similar to how we search on Google.
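As an illustration, the concatenation can be wrapped in a small helper. The name combine_queries is made up; the optional -filter:retweets operator is one of Twitter's standard search operators, and terms separated by a space must all match.

```python
def combine_queries(*queries, exclude_retweets=False):
    """Join several search terms into one query string."""
    parts = [query.strip() for query in queries if query.strip()]
    if exclude_retweets:
        parts.append("-filter:retweets")  # standard operator that hides retweets
    return " ".join(parts)


# Typical usage, assuming `api` is an authenticated tweepy.API object:
# query = combine_queries("#python", "pandas", exclude_retweets=True)
# tweets = api.search_tweets(q=query, count=100)
```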
In this blog, we have covered the details about the different versions of the Twitter API, the different access levels, and how to get them. We learnt about the Tweepy library and how to install it.
We also explored how we can use the API interface provided by Tweepy to fetch data about a user or tweet from Twitter, how to use Cursors for pagination, and how to search for tweets using hashtags, keywords, and a combination of queries.
In the second part of this blog, we will explore some other aspects of the API interface like the premium search feature and the rate limits. We will also play around with the client interface of the Tweepy library that works with the Twitter API v2.0.
Want to harness alternate sources of data to quantify human sentiments expressed in news and tweets using machine learning techniques? Check out this course on Trading Strategies with News and Tweets. You can use sentiment indicators and sentiment scores to create a trading strategy and implement it in live trading.
Till then, happy tweeting and Pythoning!
Disclaimer: All investments and trading in the stock market involve risk. Any decisions to place trades in the financial markets, including trading in stock or options or other financial instruments is a personal decision that should only be made after thorough research, including a personal risk and financial assessment and the engagement of professional assistance to the extent you believe necessary. The trading strategies or related information mentioned in this article is for informational purposes only.