
This section aims to demonstrate the proposed system’s results and reliability by examining the correlation between rainfall data and Twitter messages in a specific location. This correlation was evaluated through a case study in the city of Makkah, Saudi Arabia, using rainfall data provided by NOAA. Between 05 May and 01 Jun 2017, 10,454 tweets mentioning high-risk floods were collected. The proposed system inferred the location of 6018 tweets, of which 2082 were located within the Makkah region, that is, their geocode coordinates fell within the rectangle bounded by (22.00, 39.60) at the north-west corner and (21.04, 41.59) at the south-east corner.
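The region filter described above can be illustrated with a minimal sketch. It assumes each located tweet is reduced to a (latitude, longitude) pair; the names `BoundingBox`, `MAKKAH_BOX` and `filter_in_box`, as well as the sample points, are hypothetical and are not taken from the proposed system or its dataset. Only the two corner coordinates come from the text.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Axis-aligned latitude/longitude rectangle (hypothetical helper)."""
    north: float  # maximum latitude
    west: float   # minimum longitude
    south: float  # minimum latitude
    east: float   # maximum longitude

    def contains(self, lat: float, lon: float) -> bool:
        # A point is inside when both coordinates fall within the bounds.
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# Corners quoted in the text: north-west (22.00, 39.60), south-east (21.04, 41.59).
MAKKAH_BOX = BoundingBox(north=22.00, west=39.60, south=21.04, east=41.59)


def filter_in_box(points, box):
    """Keep only the (lat, lon) pairs that fall inside the box."""
    return [(lat, lon) for lat, lon in points if box.contains(lat, lon)]


# Illustrative coordinates only (not drawn from the collected tweets).
sample = [(21.42, 39.83), (24.71, 46.68), (21.27, 40.42)]
inside = filter_in_box(sample, MAKKAH_BOX)
```

A simple rectangle test of this kind is cheap to evaluate per tweet, at the cost of including any points that lie inside the rectangle but outside the administrative boundary of the region.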

Figure 8-3 and Figure 8-4 show the daily rainfall amount for the Arabian Peninsula and the collected tweets, classified as positive (related to high-risk floods) or negative (not mentioning floods). Together, they show a positive relationship between the rainfall amount and the number of tweets in the positive class. Moreover, as the rainfall amount decreased from 25 May until 01 Jun 2017, the number of tweets classified as positive decreased and the number classified as negative increased. This pattern may be considered evidence of the classifier’s performance. It should be noted, however, that desert is the most prominent feature of the Arabian Peninsula and covers more than half the area of Saudi Arabia; consequently, most floods occur in desert areas and might not be observed by Twitter users. For that reason, an urban area (the Makkah region, Saudi Arabia) was adopted to obtain clearer results.


Figure 8-5 depicts, to a limited extent, the increase in rainfall and in the number of tweets mentioning high-risk floods in Makkah, Saudi Arabia, between 05 May and 01 Jun 2017, during the peak of the rainy season. As can be seen, the peaks of the rainfall time series and the tweet time series are similar. However, the two series do not correspond exactly, because floods occur after the rain, and some floods result from rain in uninhabited places such as mountains.
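Because floods follow rainfall, one possible way to examine the offset between the two series is to correlate the rainfall on day d with the tweet count on day d + lag for several small lags. The sketch below is an illustration only and is not part of the analysis reported in this chapter; the function names and the toy series are hypothetical, and the toy tweet counts were constructed to follow the rainfall by exactly one day.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(rain, tweets, max_lag):
    """Correlate rain on day d with tweets on day d + lag, for lag = 0..max_lag."""
    by_lag = {}
    for lag in range(max_lag + 1):
        rain_part = rain[:len(rain) - lag]  # drop the last `lag` rainfall days
        tweets_part = tweets[lag:]          # drop the first `lag` tweet days
        by_lag[lag] = pearson_r(rain_part, tweets_part)
    return by_lag


# Toy series (made up for illustration): tweets echo the rainfall one day later.
rain = [0, 5, 20, 3, 0, 12, 1, 0]
tweets = [2, 2, 12, 42, 8, 2, 26, 4]

by_lag = lagged_correlation(rain, tweets, max_lag=3)
best_lag = max(by_lag, key=by_lag.get)  # lag with the highest correlation
```

With only 28 daily observations, as in the case study, each additional lag shortens the overlapping window, so any such result would need to be interpreted cautiously.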

Figure 8-4 Number of tweets by class between 5 May and 1 Jun 2017.

Figure 8-5 Daily rainfall amount received in the Makkah region (above) and number of tweets located in the Makkah region (below) between 5 May and 1 Jun 2017.


The indicators given by Figure 8-4 and Figure 8-5 reveal the possibility of utilising Twitter data and rainfall data to predict floods in their early stages. Developing a flood prediction system that uses Twitter data should therefore be considered in future work.

Several studies have discussed flood detection from Twitter; however, few of them have developed a flood detection system using social media data. A recent study (de Bruijn et al., 2018) developed a global multilingual system (Tag, 2018) to detect natural disasters, including floods, in real time. Their system analyses tweets written in 12 languages, including English, Indonesian, Filipino, French, German, Italian, Polish, Serbian, Spanish and Turkish. The research carried out by Holderness and Turpin (2015) involved developing a real-time flood detection system (petabencana, 2018) that analyses tweets written in English or Indonesian. However, neither of those flood detection systems is applicable to the Arabic language, even though it is the sixth most used language on Twitter and the highest number of AUs on Twitter is from Saudi Arabia (Aslam, 2017). Given the huge number of tweets written in Arabic, the proposed flood detection system makes a valuable contribution because it deals with Arabic text. Furthermore, it utilises rainfall data, which is not used in the other flood detection systems.

A Pearson correlation coefficient (Benesty, 2009) was calculated to assess the relationship between the actual amount of daily rainfall and the number of tweets mentioning high-risk flooding between 5 May and 1 June 2017 in Makkah, Saudi Arabia. A strong, positive correlation between the two variables was found: r = 0.774, n = 28, p = 1.319e-06. This is significant evidence (p < 0.001) of an association between the number of tweets and the level of daily rainfall. Figure 8-6 presents a scatterplot that summarises the results. In addition, increases in daily rainfall amounts were associated with increases in tweets mentioning high-risk floods (see Appendix C for details of the rainfall and tweet data).
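For reference, the Pearson coefficient and the t statistic that underlies its significance test can be computed as in the following minimal sketch. It is not the code used to obtain the reported figures; the function names and the toy series are hypothetical, and only r = 0.774 and n = 28 are taken from the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with n >= 3")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Toy daily series, made up for illustration (not the Appendix C data).
rainfall_mm = [0.0, 2.5, 10.0, 4.0, 0.5]
flood_tweets = [3, 20, 95, 40, 8]
r_toy = pearson_r(rainfall_mm, flood_tweets)

# t statistic implied by the values reported in the text.
t_reported = t_statistic(0.774, 28)
```

For the reported values, the t statistic is roughly 6.2 with 26 degrees of freedom. The p-value is then obtained from Student's t-distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.pearsonr`) returns the coefficient and the p-value together.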

Figure 8-6 Scatterplot of the daily rainfall amount and the number of tweets.

8.6 Chapter summary

To summarise, this chapter has presented a real-time flood detection system and the results of its implementation. The architecture of the proposed detection system and its main components have been set out. The previous chapters discussed and demonstrated the need for flood detection that utilises Twitter users as sensors, especially Arabic-speaking users. In addition to developing a flood detection system, this chapter has shown a strong indication that Twitter users can be utilised as flood sensors to predict floods in their early stages. The proposed flood detection system has two unique features that do not exist in other flood detection systems: (i) it detects flooding events from Arabic tweets, and (ii) it utilises rainfall data.
