Readme File - Twitter Corpora about Climate

CORPUS GENERAL INFO
- Corpora were collected in 3 countries (France, Belgium, Norway) and for 3 population types (media, politics, population)
- The general corpora include retweets, but these can easily be removed with the filter options

ALL FILES CAN BE OPENED IN EXCEL:
- First, open the Excel application.
- Then choose Data > Import from CSV or text.
- Options: UTF-8 encoding, semicolon delimiter, data type detection based on the first 200 rows
- Wait a few dozen seconds for the file to open

CORPUS METADATA DESCRIPTION
- id: the tweet's identification number
- date: the date the tweet was posted
- segment: the population type, one of population, media or politics
- party: if the tweet was written by a party member, the name of the party
- user_name: the real name of the tweet's author (anonymized in the population corpus)
- user_screen_name: the Twitter account name of the tweet's author (anonymized in the population corpus)
- retweet: whether the tweet is a retweet, either YES or NO
- full-text: the written content of the tweet
- matches: the extraction term(s) that caused this tweet to be included in the Climate Corpus
- hashtags: list of hashtags that appear in the tweet
- mentions: list of other Twitter account(s) that appear in the tweet
- urls: list of URL(s) that appear in the tweet
- medias: when the tweet is associated with an image, the image's original URL (it may have disappeared since the tweet was posted)

GDPR (RGPD)
- For data privacy reasons, the user names in the population corpus have been anonymized; nothing else has been anonymized
- No stigmatization has been applied to any population type or message type

CORPUS NUMBERS
- Belgium (FR + NL): 1.8M tweets (without RT: 925K)
- France: 1M tweets (without RT: 529K)
- Norway: 700K tweets (without RT: 445K)
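
EXAMPLE: READING A CORPUS FILE OUTSIDE EXCEL (PYTHON SKETCH)
The import settings and the retweet filter described above can also be applied without Excel. The following is a minimal sketch, not part of the original corpus tooling: it assumes the files are semicolon-delimited UTF-8 text whose header row uses the metadata names listed above, and that the retweet column contains YES or NO. The function name load_tweets, the file name in the comment, and the sample rows are hypothetical and only serve as illustration.

```python
import csv
import io


def load_tweets(source, drop_retweets=True):
    """Read a semicolon-delimited corpus file and optionally drop retweets.

    `source` is any open text file (or file-like object) whose first line
    is the header row. Assumes the `retweet` column holds YES or NO.
    """
    reader = csv.DictReader(source, delimiter=";")
    rows = []
    for row in reader:
        is_retweet = (row.get("retweet") or "").strip().upper() == "YES"
        if drop_retweets and is_retweet:
            continue
        rows.append(row)
    return rows


# Tiny in-memory sample standing in for a real corpus file (made-up rows).
sample = io.StringIO(
    "id;date;segment;retweet;full-text\n"
    "1;2020-01-01;media;NO;Climate report published\n"
    "2;2020-01-02;politics;YES;RT Climate report published\n"
)
tweets = load_tweets(sample)
print(len(tweets), "tweet(s) kept after removing retweets")

# With a real file (hypothetical name), open it as UTF-8:
# with open("corpus.csv", encoding="utf-8", newline="") as f:
#     tweets = load_tweets(f)
```

Reading row by row with the standard csv module avoids loading a spreadsheet application for the larger corpora; passing drop_retweets=False keeps every row.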