Readme File: Twitter Corpora about Climate
CORPUS GENERAL INFO
-	Corpora were collected in 3 countries (France, Belgium, Norway) and for 3 population types (media, politics, population)
-	The general corpora include retweets, but these can easily be removed with the filter options (retweet column set to NO)
ALL FILES CAN BE OPENED IN EXCEL:
-	First, open the Excel application.
-	Then go to Data > Import from CSV or text.
-	Options: UTF-8 encoding, semicolon delimiter, data type detection based on the first 200 lines
-	Wait a few dozen seconds and the file will open
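For readers who prefer a script to Excel, the same import settings (UTF-8, semicolon delimiter) can be sketched in Python with the standard csv module. This is only an illustration: the sample rows, the date format and the file name corpus_france.csv are made up, and the column headers are assumed to match the names listed under CORPUS METADATA DESCRIPTION.

```python
import csv
import io


def read_corpus(stream):
    """Read a semicolon-delimited corpus file into a list of dicts."""
    return list(csv.DictReader(stream, delimiter=";"))


# Tiny made-up sample standing in for a real corpus file.
sample = io.StringIO(
    "id;date;segment;retweet;full-text\n"
    "1;2020-01-01;media;NO;Example tweet about climate\n"
    "2;2020-01-02;politics;YES;RT Example tweet about climate\n"
)
rows = read_corpus(sample)
print(len(rows), rows[0]["segment"])

# With a real file (hypothetical name), open it as UTF-8:
# with open("corpus_france.csv", encoding="utf-8", newline="") as f:
#     rows = read_corpus(f)
```

Every value is read as text, so numeric or date columns would still need converting before analysis.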
CORPUS METADATA DESCRIPTION
-	id: this is the tweet’s identification number
-	date: this is the tweet’s posting date
-	segment: this is the population type (population, media or politics)
-	party: if the tweet was written by a party member, this is the name of the party
-	user_name: the tweet author’s real name (anonymized in the population corpus)
-	user_screen_name: the tweet author’s Twitter account name (anonymized in the population corpus)
-	retweet: whether the tweet is a retweet (YES or NO)
-	full-text: the written content of the tweet
-	matches: the extraction term(s) that caused this tweet to be included in the Climate Corpus
-	hashtags: list of hashtags that appear in the tweet
-	mentions: list of other Twitter account(s) mentioned in the tweet
-	urls: list of URL(s) that appear in the tweet
-	medias: when the tweet is associated with an image, this is the image’s original URL (it may have disappeared since the posting date)
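As an illustration of how these fields can be used, the sketch below drops retweets using the retweet field and counts the remaining tweets per segment. The rows are made-up stand-ins for records loaded from a corpus file, and the helper name without_retweets is hypothetical.

```python
from collections import Counter


def without_retweets(rows):
    """Keep only the rows whose retweet field is not YES."""
    return [r for r in rows if r["retweet"].strip().upper() != "YES"]


# Made-up records using the metadata fields described above.
rows = [
    {"id": "1", "segment": "media", "retweet": "NO"},
    {"id": "2", "segment": "politics", "retweet": "YES"},
    {"id": "3", "segment": "population", "retweet": "NO"},
    {"id": "4", "segment": "media", "retweet": "NO"},
]

originals = without_retweets(rows)
per_segment = Counter(r["segment"] for r in originals)
print(len(originals), dict(per_segment))
```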
GDPR (RGPD)
-	For data privacy reasons, the user names (user_name and user_screen_name) in the population corpus have been anonymized; nothing else has been anonymized
-	No type of population or type of message has been stigmatized
CORPUS NUMBERS
-	Belgium (FR + NL): 1.8M tweets (without retweets: 925K)
-	France: 1M tweets (without retweets: 529K)
-	Norway: 700K tweets (without retweets: 445K)
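The share of retweets in each corpus follows from the figures above. The sketch below does the arithmetic; because the published counts are rounded, the resulting percentages are approximate.

```python
# Rounded corpus sizes from the list above, in thousands of tweets:
# (all tweets, tweets without retweets)
corpus_sizes = {
    "Belgium": (1800, 925),
    "France": (1000, 529),
    "Norway": (700, 445),
}

retweet_share = {}
for country, (total, originals) in corpus_sizes.items():
    # Retweets are the difference between the full corpus and the originals.
    retweet_share[country] = round(100 * (total - originals) / total, 1)

for country, share in retweet_share.items():
    print(f"{country}: about {share}% retweets")
```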