Data Tables and Fields
Data from tweets, users, media, and places are always stored in the PostgreSQL database. By default, they are available under the `twitter` schema in the tables `tweets`, `users`, `media`, and `places`.
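As a starting point for querying these tables, the sketch below builds a schema-qualified `SELECT` statement. Only the default schema name `twitter` and the four table names come from the text above; the helper `select_all` and its `schema` and `limit` parameters are hypothetical and not part of the tool.

```python
# Table names listed above; "twitter" is the default schema.
TABLES = ("tweets", "users", "media", "places")


def select_all(table: str, schema: str = "twitter", limit: int = 10) -> str:
    """Build a schema-qualified SELECT for one of the documented tables.

    Hypothetical helper for illustration only.
    """
    if table not in TABLES:
        raise ValueError(f"unknown table: {table!r}")
    if not schema.isidentifier():
        raise ValueError(f"invalid schema name: {schema!r}")
    # Double-quote identifiers so PostgreSQL treats them literally.
    return f'SELECT * FROM "{schema}"."{table}" LIMIT {int(limit)};'


print(select_all("tweets"))
```

The resulting string can be passed to any PostgreSQL client. If a non-default schema was configured, pass its name as `schema`.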
Tweets
All tweets are uniquely identified by their `id` and `event` fields. The following fields are also stored in the `tweets` table.
Field | Description |
---|---|
text | Text of the tweet. If the tweet is a retweet, the text will be truncated at 140 characters. The full 280 characters can be retrieved by referring to the original tweet indicated in the |
lang | BCP47 language tag, as inferred by Twitter |
author_id | User ID of the tweet’s author |
author_handle | Handle of the tweet’s author |
created_at | Time at which the tweet was posted |
conversation_id | ID of the tweet’s conversation/reply thread |
possibly_sensitive | Whether the tweet contains sensitive material, as determined by Twitter |
reply_settings | Setting for which users were allowed to reply to the tweet |
source | Where the tweet was posted from |
author_follower_count | Number of followers of the tweet’s author at time of collection |
retweet_count | Number of retweets the tweet had at time of collection |
reply_count | Number of replies the tweet had at time of collection |
like_count | Number of likes the tweet had at time of collection |
quote_count | Number of quotes the tweet had at time of collection |
replied_to | Tweet ID of the tweet to which this tweet replied |
replied_to_author_id | User ID of the user to whom this tweet replied |
replied_to_handle | User handle of the user to whom this tweet replied |
replied_to_follower_count | Number of followers of the user to whom this tweet replied (at time of collection) |
quoted | Tweet ID of the quoted tweet |
quoted_author_id | User ID of the quoted user |
quoted_handle | User handle of the quoted user |
quoted_follower_count | Number of followers of the quoted user (at time of collection) |
retweeted | Tweet ID of the retweeted tweet. Note: this is the ID of the original tweet, not the intermediary retweet |
retweeted_author_id | User ID of the retweeted user |
retweeted_handle | User handle of the retweeted user |
retweeted_follower_count | Number of followers of the retweeted user (at time of collection) |
mentioned_author_ids | List of user IDs of any users mentioned in the tweet |
mentioned_handles | List of user handles of any users mentioned in the tweet |
hashtags | List of any hashtags used in the tweet |
urls | List of JSON objects containing information about any URLs used in the tweet |
media_keys | List of media IDs of any media used in the tweet |
place_id | ID of the place from which the tweet was sent, as determined by Twitter |
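The truncation of retweet text can be worked around in code. The sketch below is a minimal illustration, assuming rows have been loaded as dictionaries keyed by field name and that `retweeted` (described above as the ID of the original tweet) is the field to follow. The helper `full_text`, the index `tweets_by_id`, the use of string IDs, and the sample rows are all hypothetical.

```python
def full_text(row, tweets_by_id):
    """Return the original tweet's text for a retweet, else the row's own text."""
    original_id = row.get("retweeted")
    if original_id is None:
        return row["text"]  # not a retweet
    original = tweets_by_id.get(original_id)
    if original is None:
        return row["text"]  # original was not collected; keep the truncated text
    return original["text"]


# Made-up sample rows, for illustration only.
original = {"id": "100", "event": "demo", "text": "x" * 200, "retweeted": None}
retweet = {
    "id": "101",
    "event": "demo",
    "text": "RT @someone: " + "x" * 127,  # 140 characters, i.e. truncated
    "retweeted": "100",
}
tweets_by_id = {original["id"]: original}

print(len(retweet["text"]), len(full_text(retweet, tweets_by_id)))
```

Because rows are keyed by both `id` and `event`, the same original tweet may appear more than once; this sketch keeps a single row per `id`, which is enough when only the text is needed.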
Users
All users are uniquely identified by their `id` and `event` fields. The following fields are also stored in the `users` table.
Field | Description |
---|---|
name | Name of the user |
username | Username of the user, also known as their handle or screen name |
created_at | Time at which the user’s account was created |
description | Description of the user, also known as their bio |
location | Self-reported location of the user, not necessarily a geographic place |
pinned_tweet_id | Tweet ID of the user’s pinned tweet |
followers_count | Number of accounts following the user at the time of collection |
following_count | Number of accounts the user is following at the time of collection |
tweet_count | Number of tweets by the user at the time of collection |
url | Bio URL of the user |
profile_image_url | URL of the user’s profile picture |
description_urls | List of JSON objects of any URLs in the user’s bio |
description_hashtags | List of hashtags in the user’s bio |
description_mentions | List of user IDs of any users mentioned in the user’s bio |
verified | Whether the user is verified |
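Since a user is identified by the pair of `id` and `event`, the same account can appear in several rows, each with counts taken at its own time of collection. The sketch below groups such rows by user; it assumes rows are dictionaries keyed by field name, and the helper name `followers_by_event` and the sample values are hypothetical.

```python
from collections import defaultdict


def followers_by_event(rows):
    """Map each user id to a dict of {event: followers_count}."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row["id"]][row["event"]] = row["followers_count"]
    return dict(grouped)


# Made-up sample rows, for illustration only.
rows = [
    {"id": "1", "event": "a", "followers_count": 10},
    {"id": "1", "event": "b", "followers_count": 12},
    {"id": "2", "event": "a", "followers_count": 5},
]
print(followers_by_event(rows))
```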
Media
All media are uniquely identified by their `id` and `event` fields. The following fields are also stored in the `media` table.
Field | Description |
---|---|
type | The type of media |
duration_ms | If a video, the length of the video in milliseconds |
height | Height of the media |
width | Width of the media |
preview_image_url | If a video, the URL of the preview image. Note: this is not the URL of the image itself if the media is an image |
view_count | If a video, the number of views it had at the time of collection |
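Several media fields are only populated for videos. The sketch below shows one way to handle that, assuming rows are dictionaries keyed by field name and that video-only fields are `None` (SQL `NULL`) for other media; that null convention and the helper name `video_summary` are assumptions, not something the table above states.

```python
def video_summary(row):
    """Return (duration in seconds, view count) for a video row, else None."""
    duration_ms = row.get("duration_ms")
    if duration_ms is None:
        return None  # assumed to mean "not a video"
    return duration_ms / 1000, row.get("view_count")


# Made-up sample rows, for illustration only.
clip = {"id": "m1", "duration_ms": 30500, "view_count": 42}
image = {"id": "m2", "duration_ms": None, "view_count": None}

print(video_summary(clip))
print(video_summary(image))
```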
Places
All places are uniquely identified by their `id` field. Place objects returned by the API are assumed to be static, so they do not have an `event` field. The following fields are also stored in the `places` table.
Field | Description |
---|---|
name | Short name of the place |
full_name | Full name of the place |
country | Full name of the country where the place is |
country_code | ISO Alpha-2 code of the country where the place is |
geo | A GeoJSON object with the place’s coordinates |
place_type | Type of the place (e.g. city), as determined by Twitter |
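The `geo` field can be reduced to a single point for mapping. The sketch below is a minimal illustration, assuming the GeoJSON object carries a four-element `bbox` member in the order the GeoJSON specification defines (west, south, east, north); whether a given place has a `bbox` is an assumption to check against your data. The helper name `bbox_center` and the sample object are hypothetical.

```python
import json


def bbox_center(geo):
    """Return (longitude, latitude) at the centre of a GeoJSON bbox, or None.

    Accepts either a parsed dict or a JSON string. Boxes that cross the
    antimeridian are not handled.
    """
    if isinstance(geo, str):
        geo = json.loads(geo)
    bbox = geo.get("bbox")
    if not bbox or len(bbox) != 4:
        return None
    west, south, east, north = bbox
    return (west + east) / 2, (south + north) / 2


# Made-up sample object, for illustration only.
sample = {"type": "Feature", "bbox": [-74.0, 40.0, -73.0, 41.0], "properties": {}}
print(bbox_center(sample))
```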