Friday, 23 December 2016

Identifying popular tourism attractions in London by using geo-tagged photos from Flickr

Dr. Yeran Sun is a postdoctoral researcher at the Urban Big Data Centre, University of Glasgow, UK. His research interests include big data and urban studies, social media research and sentiment analysis, transport and social inequality, and transport and public health.

Social media offers crowd-sourced data for social science research. In particular, GPS-enabled devices, such as smartphones, allow social media users to share their real-time locations on social media platforms.

In my presentation, Flickr geo-tagged photos are used to identify popular tourist attractions in London.

‘Geo-tagged’ photos and tweets from Flickr, Instagram and Twitter users reveal the footprints and mobility of those users. Compared with Instagram and Twitter, Flickr has a larger proportion of tourists among its users, and geo-tagged photos from Flickr users have been used as crowd-sourced data in recent tourism research. However, the number of geo-tagged photos is not proportional to the number of real tourist visits. Visits to popular tourist attractions such as landmarks are likely to be over-represented by Flickr photos, while visits to less popular attractions are likely to be under-represented.

Although geo-tagged photos are biased, crowd-sourced data from Flickr can still be used to measure the popularity of tourist attractions that have no ticketing records, such as central squares, public statues, public parks, rivers, mountains, bridges and so forth. As clusters of photos tend to form around popular attractions where tourists like to take photos, we can identify popular attractions by detecting significant spatial clusters of geo-tagged photos.
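To make the idea of detecting spatial clusters concrete, here is a minimal sketch of a density-based clustering routine in the style of DBSCAN, written in plain Python. It is not the R workflow used in the presentation. The coordinates are made-up points near two London landmarks, and the names `haversine_m`, `dbscan`, `eps_m` and `min_pts` are hypothetical choices for this illustration only.

```python
import math

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(a))

def dbscan(points, eps_m, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    NOISE = -1
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # Brute-force neighbourhood search; the point itself is included.
        return [j for j in range(len(points))
                if haversine_m(points[i], points[j]) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels

# Illustrative (invented) photo locations as (lat, lon) pairs.
photos = [
    (51.5007, -0.1246), (51.5008, -0.1247), (51.5006, -0.1244), (51.5009, -0.1245),
    (51.5055, -0.0754), (51.5054, -0.0755), (51.5056, -0.0752),
    (51.5400, -0.2000),  # an isolated photo
]
labels = dbscan(photos, eps_m=50, min_pts=3)
print(labels)
```

With a 50 m radius and a minimum of three photos, the two tight groups come out as separate clusters and the isolated photo is marked as noise. The brute-force neighbour search is quadratic in the number of photos, so a real dataset would call for a spatial index or an existing library.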

In my presentation, significant clusters are detected by using a density-based clustering method called DBSCAN. Most of those clusters spatially overlap popular tourist attractions in London. The free-to-use tools QGIS and R are used to map the geo-tagged photos and to carry out cluster detection, respectively. To run the DBSCAN algorithm in R, we need to install the ‘dbscan’ package.

Via the Flickr APIs (https://www.flickr.com/services/api/), we can download public Flickr data, including photos, tags and coordinates, by defining geographic boundaries or searching for keywords. There are API kits written in a variety of languages, including C, Delphi, Java, Python, PHP, .NET, Ruby and so on.

You might also use shared Flickr data for your research. Yahoo Research shares Flickr data with researchers (https://research.yahoo.com). Shared datasets can be found here (https://webscope.sandbox.yahoo.com/catalog.php?datatype=i).
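As a rough illustration of downloading photos by geographic boundary, the sketch below builds a request URL for the Flickr REST method `flickr.photos.search`. It only constructs the URL and makes no network call. The parameter names (`bbox`, `has_geo`, `extras`, `per_page`) reflect my reading of the Flickr API documentation and should be checked against the current docs. The bounding box for London is an approximate, illustrative value, and `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

FLICKR_REST = "https://api.flickr.com/services/rest/"

def build_search_url(api_key, bbox, page=1, per_page=250):
    """Build a flickr.photos.search URL for geo-tagged photos inside a bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat) in decimal degrees.
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "bbox": ",".join(str(v) for v in bbox),
        "has_geo": 1,                      # only photos with coordinates
        "extras": "geo,tags,date_taken",   # ask for coordinates and tags in the response
        "per_page": per_page,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,               # plain JSON rather than JSONP
    }
    return FLICKR_REST + "?" + urlencode(params)

# Approximate bounding box around Greater London (assumed for illustration).
LONDON_BBOX = (-0.51, 51.28, 0.33, 51.69)

url = build_search_url("YOUR_API_KEY", LONDON_BBOX)
print(url)
```

Fetching this URL with a valid API key, and looping over the `page` parameter, should return photo records whose latitude and longitude can then be exported for mapping and clustering.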
