This 25-million-word corpus was designed for large-scale quantitative analyses of political communication via Twitter. It contains tweets and replies sent from the official accounts of US politicians during the Trump presidency, as well as earlier tweets by the same politicians. It thus allows for comparisons between the communication styles and strategies employed by Trump and by other politicians, as well as for studies of change in individual communication styles.
The following tweets are contained in the corpus:
- Trump: Tweets sent between the registration of his account and November 4th 2020.
- US Senators in office in November 2020: Tweets sent between the registration of their accounts and November 10th 2020.
- US Senators no longer in office in November 2020, but who were in office earlier during the Trump presidency: Tweets sent between the registration of their accounts and the date they left the Senate.
- Tweets deleted by the users or by Twitter are not contained in the corpus. Under certain conditions, the algorithm may skip tweets, so some further tweets may be missing.
Size of subcorpora:
- Trump: 900,000 words (700,000 excluding quotes and hidden retweets)
- Republican senators: 12 million words
- Democratic senators: 12 million words
- Independent senators: 500,000 words
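
As a quick sanity check, the subcorpus sizes listed above can be added up and compared with the stated overall size of about 25 million words. The sketch below only uses the rounded figures from this list, so the total and the percentage shares are approximate. The variable names are illustrative and not part of the corpus.

```python
# Approximate subcorpus sizes in words, taken from the list above (rounded figures).
subcorpus_words = {
    "Trump": 900_000,
    "Republican senators": 12_000_000,
    "Democratic senators": 12_000_000,
    "Independent senators": 500_000,
}

# Overall size implied by the rounded subcorpus figures.
total_words = sum(subcorpus_words.values())

# Share of each subcorpus in the whole corpus, in percent.
shares = {name: 100 * n / total_words for name, n in subcorpus_words.items()}

print(f"total: {total_words:,} words")
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

The rounded figures add up to roughly 25 million words. The Trump subcorpus makes up only a small share of the total, which is worth keeping in mind when comparing raw frequencies across subcorpora; normalized frequencies (for example per million words) are usually the safer basis for comparison.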