Friday, 14 July 2017

Book review: The SAGE Handbook of Social Media Research Methods

Charlotte Saunders is a Research Analyst at NatCen Social Research and the newest member of the NSMNSS team!
She works on quantitative and secondary analysis projects across a range of policy areas. Previously, Charlotte spent three years working for Ipsos MORI on a variety of qualitative and quantitative projects. Most recently she spent several months with VSO in Tanzania managing teams of young volunteers working in secondary schools. Charlotte holds an MSc in Public Health from the London School of Hygiene and Tropical Medicine. Her research project there looked at inequalities in access to clean water and sanitation in South America. 

In the past decade social media has transformed many aspects of our lives. It has revolutionised the way we communicate and the widespread adoption of mobile devices means its impact on everyday life continues to grow. As social researchers, we know that social media opens up huge new reserves of naturally occurring data for us to play with. The large volume of data produced is a goldmine for quantitative researchers; the opportunities provided by big data are well documented. But there are also new prospects in qualitative research, with new sources of ‘thick data’ which help us to understand the stories, emotions and worldviews behind the numbers.

Despite all these possibilities, many of us don’t know how to take advantage of them. There are lots of things to consider, from the technical knowledge required to access and store the data to the potential ethical issues involved in using people’s data for research without first asking their permission. The SAGE Handbook of Social Media Research Methods promises to guide researchers through the whole process, from research design and ethical issues through data collection and storage to analysis and interpretation.

The editors, Luke Sloan (Senior Lecturer at Cardiff University and Deputy Director of the Social Data Science Lab) and Anabel Quan-Haase (Associate Professor of Information and Media Studies and Sociology at the University of Western Ontario), have compiled chapters which cover the whole research process. The discussions of the limitations of naturally occurring data from social media, and of some of the techniques that can be used to overcome them, are practical and guide the researcher through the issues clearly. The chapters outlining the history, structure and demographics of some less common social media platforms give a good basic overview for those who rarely stray away from Facebook or Twitter.

Overall this is a helpful guide to research using social media. The ethics discussions outline the key issues that researchers need to consider. There are also clear step-by-step guides which walk researchers through some of the technical processes needed to engage with social media sites. Simon Hegelich’s chapter “R for Social Media Analysis” is a good example: clear and easy to follow, it takes the reader through a simple project analysing and visualising data from Twitter using the free programming language R.

Most of the book is well written and easy to follow although some chapters are less accessible and require significant existing knowledge. These chapters are likely to be valuable for experienced researchers looking to transfer their knowledge to a social media setting, but students and junior researchers may well find themselves scouring the internet for definitions and context.

Social Media Research Methods fulfils its aim to allow researchers to “apply and tailor the various methodologies to their own research questions”.  The step-by-step guides are logical and easy to follow and the case studies demonstrate how methods can be used in real research. For those with a good existing understanding of the research methodologies and techniques in their field this is an invaluable text opening up the social media research world. Those with less experience will probably need to refer to other resources to get up to speed with some chapters, but even then this is a useful addition to any social research library.


The SAGE Handbook of Social Media Research Methods is available to purchase here.

Friday, 2 June 2017

Q&A Session with Authors of The SAGE Handbook of Social Media Research Methods

Last Friday there was a launch event in San Diego at ICA (International Communication Association) for the new SAGE Handbook of Social Media Research Methods.

The editors Luke Sloan and Anabel Quan-Haase kindly responded to the questions that you submitted. If you missed the event, you can view the Q&A session here: https://storify.com/SAGE_Methods/q-a-with-anabel-quan-haase-and-luke-sloan

Let us know your thoughts by tweeting us @NSMNSS!

Friday, 19 May 2017

The SAGE Handbook of Social Media Research Methods – Questions for Authors

Have you read the recently published SAGE Handbook of Social Media Research Methods? It offers a step-by-step guide to overcoming the challenges inherent in research projects that deal with ‘big and broad data’, from the formulation of research questions to the interpretation of findings. The Handbook includes chapters on specific social media platforms such as Twitter, Sina Weibo and Instagram, as well as a series of critical chapters.

There is a launch event taking place on Friday 26th May in the US, at the Communication and Technology reception at ICA (San Diego), sponsored by SAGE. The editors Luke Sloan and Anabel Quan-Haase are happy to answer any questions you may have about the book, even if you aren’t able to attend in person – their responses will be posted throughout the day on Twitter via @SAGE_Methods.

If you have any questions about particular chapters, or to do with social media research methods generally, please tweet us your questions @NSMNSS using #SMRM or email Keeva.Rooney@natcen.ac.uk or Franziska.Marcheselli@natcen.ac.uk by Wednesday 24th May.

We will pass your questions on and you can look out for the responses during the event!


Friday, 12 May 2017

Anti-Islamic Content on Twitter

This blogpost was written by Carl Miller, Research Director, and Josh Smith, Researcher, at the Centre for the Analysis of Social Media (CASM) at Demos. @carljackmiller

This analysis was presented at the Mayor of London’s Policing and Crime Summit on Monday 24 April, 2017.

The Centre for the Analysis of Social Media at Demos has been conducting research to measure the volume of messages on Twitter algorithmically considered to be derogatory towards Muslims over a year, from March 2016 to March 2017. This is part of a broad effort to understand the scale, scope and nature of uses of social media that are possibly socially problematic and damaging.

Over a year, Demos’ researchers detected 143,920 Tweets sent from the UK considered to be derogatory and anti-Islamic – this is about 393 a day. These Tweets were sent from over 47,000 different users, and fell into a number of different categories – from directed insults to broader political statements.

A random sample of hateful Tweets was manually classified into three broad categories:
  • ‘Insult’ (just under half): Tweets used an anti-Islamic slur in a derogatory way, often directed at a specific individual.
  • ‘Muslims are terrorists’ (around one fifth): Derogatory statements generally associating Muslims and Islam with terrorism.
  • ‘Muslims are the enemy’ (just under two fifths): Statements claiming that Muslims, generally, are dedicated toward the cultural and social destruction of the West.
The researchers found that key events, especially terrorist attacks, drive large increases in the volume of messages on Twitter containing this kind of language.

The Brussels, Orlando, Nice, Normandy, Berlin and Quebec attacks all caused large increases. There was a period of heightened activity over Brexit, and sometimes online ‘Twitter storms’ (such as the use of derogatory slurs by Azealia Banks toward Zayn Malik) also drove sharp increases.

Tweets containing this language were sent from every region of the UK, but the most over-represented areas, compared to general Twitter activity, were London and the North-West.

Of the 143,920 Tweets containing this language and classified as being sent from within the UK, 69,674 (48%) contained sufficient information to be located within a broad area of the UK. To measure how many Tweets each region generally sends, a random baseline of 67 million Tweets was collected over 19 days in late February and early March. The volume of Tweets containing derogatory language towards Muslims was compared to this baseline. This identified regions where the volume was higher or lower than would be expected on the basis of general activity on Twitter.
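As a rough illustration of this kind of baseline comparison, the sketch below computes an observed-to-expected ratio for each region. It is only a minimal example under stated assumptions: the function name, the region labels and all the counts are hypothetical, and it is not Demos’ actual method, which is set out in the full paper.

```python
def representation_ratios(flagged, baseline):
    """Return an observed/expected ratio per region.

    The expected count for a region is the total number of flagged Tweets
    multiplied by that region's share of the general baseline sample.
    A ratio above 1 means the region is over-represented.
    """
    total_flagged = sum(flagged.values())
    total_baseline = sum(baseline.values())
    ratios = {}
    for region, base_count in baseline.items():
        if base_count == 0:
            continue  # no baseline activity, so no expectation can be formed
        expected = total_flagged * base_count / total_baseline
        ratios[region] = flagged.get(region, 0) / expected
    return ratios


# Hypothetical counts for illustration only (not Demos' data).
flagged = {"Region A": 300, "Region B": 100, "Region C": 100}
baseline = {"Region A": 4000, "Region B": 4000, "Region C": 2000}

ratios = representation_ratios(flagged, baseline)
for region, ratio in sorted(ratios.items()):
    print(f"{region}: {ratio:.2f}")
```

With these made-up numbers, Region A sends more flagged Tweets than its general activity would suggest, Region B fewer, and Region C about as many as expected.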

In London, North London sent markedly more tweets containing language considered derogatory towards Muslims than South London.

27,576 (39%) tweets were sent from Greater London. Of these, 14,953 Tweets (about half) could be located to a more specific region within London (called a ‘NUTS-3 region’; typically either a London Borough or a combination of a small number of London Boroughs).[1]
  • Brent, Redbridge and Waltham Forest sent the highest number of derogatory, anti-Islamic Tweets relative to their baseline average of general Twitter activity.
  • Westminster and Bromley sent the fewest derogatory, anti-Islamic Tweets relative to their baseline average of general Twitter activity.

Demos’ research identified six different online tribes. [2] These were:

Core political anti-Islam. The largest group of about 64,000 users including recipients of Tweets. Politically active group engaged with international politics.
  • Hashtags employed by this group suggest engagement in anti-Islam and right wing political conversations: (#maga #tcot #auspol #banIslam #stopIslam #rapefugees)
  • In aggregate, words in user descriptions emphasise nationality, right-wing political interest and hostility towards Islam (anti, Islam, Brexit, UKIP, proud, country)
Contested reactions to Terrorist attacks. The second largest group, of about 18,000 users, including recipients of tweets.
  • Aggregate overview of user descriptions implies a relatively young group (sc, snapchat, ig, instagram, 17, 18, 19, 20, 21)
  • User descriptions also imply a mix of political opinion (blacklivesmatter, whitelivesmatter, freepalestine)
  • Hashtags engage in conversations emerging in the aftermath of terrorist attacks (#prayforlondon, #munich, #prayforitaly, #prayforistabul, #prayformadinah, #orlando)
  • Likewise, hashtags are a mix of pro- and anti-Islamic (#britainfirst, #whitelivesmatter, #stopislam, #postrefracism, #humanity)
The counter-speechers. A group of 8,700 people, although the data collection, by design, only detected the part of the counter-speech conversation containing language that can be used in a way derogatory towards Muslims. It is therefore likely that it did not collect the majority of counter-speech activity.[3]

The shape of the cluster shows a smaller number of comments that were heavily replied to or retweeted.
  • Hashtags engage predominantly with anti-racist conversations (#racisttrump, #postrefracism, #refugeeswelcome, #racism, #islamophobia)
  • In aggregate, user descriptions show a mix of political engagement and general identification with left-wing politics (politics, feminist, socialist, Labour).
  • Overall they also show more descriptions of employment than the other clusters (writer, author, journalist, artist).
The Football Fans. 7,530 users are in this cluster, including recipients of Tweets.
  • The bio descriptions of users within this cluster overwhelmingly contain football-related words (fan, football, fc, lfc, united, liverpool, arsenal, support, club, manchester, mufc, chelsea, manutd, westham)
  • No coherent use of hashtags. This cluster engaged in lots of different conversations.
India/Pakistan. Just under 5,000 users are in this cluster (including recipients).
  • Hashtags overwhelmingly engage in conversation to do with India-Pakistan relations or just Pakistan (#kashmir, #surgicalstrike, #pakistan, #actagainstpak).
  • In aggregate, words in user descriptions relate to Indian/nationalist identity and pro-Modi identification (proud, Indian, hindu, proud indian, nationalist, dharma, proud hindu, bhakt)
The Gamers. 2,813 users are in this cluster (including Tweet recipients).
  • There is no coherent use of hashtags.
  • Overall, aggregate comments in user descriptions either imply young age (16, 17, 18) or are related to gaming (player, cod [for ‘Call of Duty’], psn)
A small number of accounts overall are responsible for many of the tweets containing language generally considered to be derogatory towards Muslims.
  • 50% of Tweets classified as containing language considered anti-Islamic and derogatory were sent by only 6% of accounts.
  • 25% of Tweets classified as containing language considered anti-Islamic and derogatory were sent by 1% of accounts.
Likewise, a small number of accounts were the recipients of the derogatory, anti-Islamic activity that was directed at a particular person.
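
A concentration figure of this kind (a given share of Tweets coming from a small share of accounts) can be derived from per-account Tweet counts. The sketch below is a hedged illustration only: the function name and the counts are hypothetical and are not taken from the Demos dataset.

```python
def account_share_for_tweet_share(tweets_per_account, tweet_share):
    """Smallest share of accounts that together sent `tweet_share` of all Tweets.

    Accounts are ranked from most to least active, and their counts are
    accumulated until the requested share of the total is reached.
    """
    counts = sorted(tweets_per_account, reverse=True)
    target = tweet_share * sum(counts)
    running = 0
    for n_accounts, count in enumerate(counts, start=1):
        running += count
        if running >= target:
            return n_accounts / len(counts)
    return 1.0  # only reached for an empty list


# Hypothetical per-account Tweet counts (ten accounts, 100 Tweets in total).
counts = [50, 20, 10, 5, 5, 3, 3, 2, 1, 1]

half = account_share_for_tweet_share(counts, 0.50)
three_quarters = account_share_for_tweet_share(counts, 0.75)
print(f"50% of Tweets came from {half:.0%} of accounts")
print(f"75% of Tweets came from {three_quarters:.0%} of accounts")
```

In this toy example a single very active account produces half of all Tweets, which is the same pattern, in miniature, as the one reported above.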

The full paper, outlining methodology and ethical notes, can be downloaded here. 



NOTES –
[1] An important caveat is that the volumes associated with each of these regions are obviously smaller than the total number of Tweets in the dataset overall.
[2] A caveat here is that this network graph includes Tweets that are misclassified and also includes the recipients of abuse. It is also important to note that not everyone who shares Tweets does so with malicious intent; they can be doing so to highlight the abuse to their own followers.
[3] In other work on the subject we have found there are usually more posts expressing solidarity with and support for Muslims than attacks on them.

Wednesday, 29 March 2017

Who uses Twitter?

Luke Sloan is a Senior Lecturer in Quantitative Methods and Deputy Director of the Social Data Science Lab at the School of Social Sciences, Cardiff University, UK. Luke has worked on a range of projects investigating the use of Twitter data for understanding social phenomena covering topics such as election prediction, tracking (mis)information propagation during food scares and ‘crime-sensing’. His research focuses on the development of demographic proxies for Twitter data to further understand who uses the platform and increase the utility of such data for the social sciences. He sits as an expert member on the Social Media Analytics Review and Information Group (SMARIG) which brings together academics and government agencies. @drlukesloan

Who uses Twitter?

It’s a simple question, but one that is tricky to answer. We all think we know the types of people who use Twitter – the urban elite, celebrities, professionals, young people… but providing an empirical account is challenging and without knowing who tweets we can’t even start a conversation about representativeness and bias. To understand how the social world manifests in the virtual we need to know who is present or underrepresented.

Much work has been done on using Twitter metadata to estimate proxy demographics for UK users such as gender (Sloan et al. 2013) and age, occupation and social class (Sloan et al. 2015), but these methods rely on people self-reporting a first name, an age or date of birth and an occupation to classify. The question has always been whether certain groups, such as older people and those from certain occupations, are less likely to choose to construct their virtual identity with reference to these characteristics or not.

Clearly it’s quite a leap forward to be able to use British Social Attitudes 2015, a random probability sample survey of over 4,000 respondents with weights calculated to account for non-response bias, to help us understand the Twitter population. The data allow us to compare Twitter usage by demographic groups benchmarked against the 2011 Census whilst evaluating previous attempts at demographic proxies.
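
To make the role of the survey weights concrete, here is a minimal sketch of a weighted estimate of Twitter use within one demographic group. Everything in it is an assumption for illustration: the field names (`sex`, `uses_twitter`, `weight`) are hypothetical rather than actual BSA variable names, and the four respondents are invented.

```python
def weighted_share(respondents, group_key, group_value):
    """Weighted proportion of Twitter users within one demographic group.

    Each respondent counts in proportion to their survey weight rather
    than counting as exactly one person.
    """
    in_group = [r for r in respondents if r[group_key] == group_value]
    total_weight = sum(r["weight"] for r in in_group)
    if total_weight == 0:
        return None  # nobody in this group, so no estimate
    user_weight = sum(r["weight"] for r in in_group if r["uses_twitter"])
    return user_weight / total_weight


# Invented respondents; field names are hypothetical, not BSA variables.
respondents = [
    {"sex": "male", "uses_twitter": True, "weight": 1.5},
    {"sex": "male", "uses_twitter": False, "weight": 0.5},
    {"sex": "female", "uses_twitter": True, "weight": 1.0},
    {"sex": "female", "uses_twitter": False, "weight": 3.0},
]

male_share = weighted_share(respondents, "sex", "male")
female_share = weighted_share(respondents, "sex", "female")
print(f"male: {male_share:.2f}, female: {female_share:.2f}")
```

An unweighted count would put Twitter use at one in two for both groups here; the weights shift the estimates apart, which is why weighted survey figures can differ from raw tallies.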

So, how accurate is the picture of the demographic characteristics developed through proxies?

As it turns out we find some interesting discrepancies. According to the BSA data we find more men on Twitter than expected and we see that although most users are younger there are more older users on the platform than we previously thought. We also find that there are strong class effects regarding Twitter use, largely in line with previous proxy estimates most of the time but substantially out of line for certain groups. The full paper is open access and can be read here.

How does this aid our understanding of how the social world manifests online? To take an example, a recent study by Draper et al. found that, during the horsemeat food scare of 2013, Twitter was dominated by jokes and humour. The overall discourse suggested that this wasn’t perceived as a serious incident and that the issue wasn’t really a public concern, but we now know that Twitter is dominated by the higher NS-SEC groups – people with high incomes who are the least likely to come into contact with the budget adulterated products. Twitter thought it was funny because Twitter is dominated by people who were largely unaffected by the scare. This is an important lesson in how representation impacts upon what the data is telling us.

Of course, it’s no surprise that Twitter is dominated by the professional and managerial groups, but at least now we have some strong evidence to underwrite our expectations.



Read the full paper: Sloan, L. (2017) Who Tweets in the United Kingdom? Profiling the Twitter Population Using the British Social Attitudes Survey 2015, Social Media + Society 3:1, DOI: https://doi.org/10.1177/2056305117698981


Thursday, 23 February 2017

Programming as Social Science - new methods network

Phillip Brooker is a Research Associate at the University of Bath working in social media analytics, with a particular interest in the exploration of research methodologies to support the emerging field. His background is in sociology, drawing especially on ethnomethodology and conversation analysis, science and technology studies, computer-supported cooperative work and human-computer interaction. Phillip has previously contributed to the development of Chorus (www.chorusanalytics.co.uk), a Twitter data collection and visualisation suite. He currently works on CuRAtOR (Challenging online feaR And OtheRing), an interdisciplinary project focusing on how "cultures of fear" are propagated through online "othering".

Digital data and computational methods are increasingly becoming consolidated as essential elements of social science research and teaching. However, the algorithmic processes through which digital data are extracted, processed and visualised are often ‘black boxed’ and obscured from researchers who use those tools, which hinders our understanding of how they might be handled methodologically. Hence, there is an already-high and ever-increasing need for social scientists to engage with computational tools as a “critical technical practice” (Agre, 1997). In other words, since we are now pretty much completely reliant on software as part of our everyday research and teaching practices, it is all the more important that we are able to unpick and interrogate how these software packages operate, in order to better account for our data and research practices!

To this end, Jonathan Gray and I (both at the University of Bath) have set up a mailing list/network called “Programming as Social Science (PaSS)”, for researchers interested in software programming both as an object of study and as a tool that we can learn and use within social science research. Here, we’re capitalising on lots of good work that has already been done in fields such as Science and Technology Studies, New Media Studies, Social Media Analytics, Software Studies, Ethnomethodology, Human-Computer Interaction, Computer-Supported Cooperative Work, and so on. All of these fields (and many more we haven’t listed!) have contributions to make in regard to understanding how we might critically leverage programming skills as part of social science teaching and research. So the PaSS mailing list/network has been established to act as a (low-traffic) hub for discussing these kinds of ideas, as well as sharing resources, updates, announcements and initiatives around programming in the context of social research.

If you’d like to join in, you can sign up via the following link: www.jiscmail.ac.uk/PaSS. Please feel free to invite anyone and share widely; the computer geek in me is very much looking forward to chatting about programming as part of my work!

Thursday, 16 February 2017

Visualising Facebook

Daniel Miller is Professor of Anthropology at University College London. Recent books include Social Media in an English Village (UCL Press 2016); Miller et al., How the World Changed Social Media (UCL Press 2016); with J. Sinanan, Webcam (Polity 2014); edited with H. Horst, Digital Anthropology (Bloomsbury 2012); with M. Madianou, Migration and New Media (Routledge 2012); Consumption and its Consequences (Polity 2012); with S. Woodward, Blue Jeans (California 2012); and Tales from Facebook (Polity 2011). He recently completed a volume about media in the social lives of patients with a terminal diagnosis, forthcoming as The Comfort of People (Polity 2017). @DannyAnth

This March will see the publication of a new book called Visualising Facebook, which I have written with Jolynna Sinanan. It will be available as a free download from UCL Press. One of the key arguments from the larger Why We Post project, of which this book is one out of eleven volumes, is that human communication has fundamentally changed. Where previously it consisted almost entirely of either oral or textual forms, today, thanks to social media, it is equally visual. Think literally of Snapchat. So, it is a pity that when you look at the journals and most of the books about social media, they often contain either no, or precious few, actual visual illustrations from social media itself. One of the joys of digital publication is that it is possible to reproduce hundreds of images. So, our book is stuffed to the gills with photographs and memes taken directly from Facebook, which is, after all, our evidence.
For example, as academics, we might suggest that the way women respond to becoming new mothers in Trinidad is entirely different from what you would find in England. In the book, we can reproduce examples from hundreds of cases, where it is apparent that when an English woman becomes a mother she, in effect, replaces herself on Facebook with images of her new infant. Indeed, these often become her own profile picture for quite some time. By contrast, one can see postings by new mothers in Trinidad, where they are clearly trying to show that they still look young and sexy or glamorous, precisely because they do not want people to feel that these attributes have been lost merely because they are now new mothers.

In writing this book we examined over 20,000 images. These provide the evidence for many generalisations, such as that Trinidadians seem to care a good deal about what they are wearing when they post images of themselves on Facebook, while, by and large, English people do not. But this becomes much clearer when you can see the actual images themselves. Or we might suggest that English people are given to self-deprecating humour, while Trinidadians are not. Or that in England gender may create a highly repetitive association between males and generic beer, as against women with generic wine. In every case, you can now see exactly what we mean. We also have a long discussion about the importance of memes, why we call them ‘the moral police of the Internet’, and how memes help to establish what people regard as good and bad values. This makes much more sense when you are examining typical memes with that question in your head.


To conclude, given the sheer proportion of social media posting that now consists of visual images, it would seem a real pity to look this gift horse in the mouth. Firstly, it has now become really quite simple to look at tens of thousands of such images in order to come to scholarly conclusions. But equally, it is now much easier to also include hundreds of such images in your publications to help readers have a much better sense of what exactly those conclusions mean and whether they agree with them.

Friday, 3 February 2017

Mine your Data – Why understanding online health communities matters

Originally posted on the NatCen blogsite on 10/11/16 
Aude Bicquelet is Research Director in the Health team. Prior to joining NatCen in 2016, she held a fellowship at the LSE (Department of Methodology) where she taught courses on Research Design, Mixed-Methods and Text Mining approaches. 
Aude specialises in the analysis of ‘Big Qualitative Data’ on health related issues and has worked with professional and regulatory health bodies such as the National Institute for Health and Care Excellence (NICE) and the Royal College of Physicians.  Methodological and substantive outputs of her research have been published in academic journals; she has also published a book on ‘Textual Analysis’ with Sage.
In addition to her interest in Health policies she is interested in Social and Political attitudes and has researched widely in the areas of political participation and élites’ attitudes towards the EU.

A staggering 73% of adults in the UK turn to the internet when experiencing health problems. Whether it is to check symptoms, find out about available treatments or share experiences about living with a particular condition, the internet has become the first port of call with many turning to the web before they even consider going to see a doctor. While many of these conversations take place on health-related websites such as Patient or Netdoctor, people suffering from health conditions also share their experiences on social media – and health practitioners should take note.  
Earlier this week I presented findings from a recent study looking into how people use social media to discuss health issues at the ESRC Festival of Social Science. In this study, funded by the NCRM, we used text mining techniques to analyse comments about chronic pain posted under YouTube videos.  
We found that chronic pain sufferers use YouTube to describe their experiences and vent their frustration. We analysed over 700 YouTube comments, and found they can be sorted into one of five categories:
  • Sharing Experiences: commenters thank each other for sharing their experiences in the videos posted on the website, emphasising tolerance and empathy for chronic pain sufferers.
  • Expressing Frustration: chronic pain sufferers expressed their frustration in their own words. These illustrate how YouTube and other social media offer new avenues for communicating pain outside clinical contexts.
  • Coping with Pain: chronic pain sufferers used social media to share their daily practices to cope with chronic pain.
  • Alternative Therapy: commenters spoke openly about their use of alternative medicines, illegal drugs or alcohol to manage their pain. The often conflicting relationship with clinicians – who were perceived as over- or under-medicating – was also common in this category.
  • Risks and Concerns: they also discuss the risks associated with different types of medication – in particular, addiction and overdose - along with increased risks of depression associated with some treatments against pain.
The insights gained from social media research provide important substantive information for health practitioners. People communicate online in a way they don’t during interviews with researchers or during doctors’ appointments. Online forums and social media are rife with information that’s difficult to obtain through traditional research techniques where social desirability, fear of judgement or stigma, and wanting to be seen as ‘functioning well’ may influence what people are willing to say. From a purely practical perspective, they also provide freely available, naturally occurring data, with access (at times) to hard-to-reach groups.
Of course, there is a great deal of uncertainty around how to harness the opportunities of analysing the wealth of health information posted online in a representative, robust and ethical way.
Despite their usefulness and efficiency, analyses of Internet comments on health forums do raise a host of concerns. One is representativeness: the views of a cohort with the access, technical skills and inclination to post comments on websites are over-represented, while the views of others are excluded (the so-called ‘digital divide’). Another is consent: online commentators may not expect to be research subjects.
Nevertheless, the explosion of Big Data and the popularity of online communities might precipitate the need to integrate social media analysis and health research in the near future. For instance, it has been shown that patients who visit their doctors with inappropriate or misinterpreted information from the internet do little to enhance doctor–patient communication (see Ziebland 2004). But doctor–patient communication could be improved if health professionals themselves were better informed about the common fears, and sometimes the common ‘myths’, disseminated in online health communities.

Watch Aude’s presentation from NatCen’s event ‘What Social Media Can Tell us about Society’, live from Twitter’s London HQ. This event was part of the ESRC Festival of Social Science
If you’re interested in how social media research can help you, please get in touch: aude.bicquelet@natcen.ac.uk or new-business@natcen.ac.uk

Monday, 16 January 2017

How Social Media Can Be a Researcher’s Miracle or Downfall

Cassie Phillips is a freelance technology writer who also dabbles in social media. She’s a firm believer that everyone can find a use for social media whether to make friends or conduct a research experiment. Like technology, she finds social media is just another tool to add to one’s arsenal. @securethoughtsc


At first, social media and research might seem to be at odds, but they actually complement one another. Research involves the production, use and consumption of knowledge. Before social media, scientists and researchers disseminated information via conferences, journals, peer reviews and publication. What brings all of these events together is collaboration. This is where the true benefits of social media become apparent.

Finding Information
You probably already have a system in place to find journals and articles that will suit your research. This can include using information portals, attending meetings and even focusing on certain peer-reviewed journals. While still useful, this takes time and can also lead to information overload. Social media can help you find more relevant information and sift through the noise. Following researchers within your discipline can help you find articles and journals that may be particularly valuable to you.

On the flip side, you still need to verify the sources you find online. Anyone can publish an article or post on the internet, so it's more important than ever to check sources and make sure what you're reading is legitimate.

Knowledge Creation
Most researchers view data generation as the main aspect of the job. For the most part, this means finding other literature that supports your research. However, the other important aspect is ensuring you publish and disseminate the information at the right time. So where does social media fit, especially when there are risks in communicating your research while it’s still going on? After all, it can reduce your chances of getting published while also providing ammunition against you should you make a mistake. And with social media, there’s also the possibility your account might get hacked, especially if your internet connection isn’t secure, though luckily there are ways to protect yourself.

So what are the benefits? Consider this example. Marianne Hatzopoulou, a civil engineering professor, wanted to research the impact of air pollution on cyclists. She turned to Twitter and sent out a couple of tweets encouraging people to fill out a survey. A popular cycling blog found the survey and wrote an article about it. This got picked up by a local newspaper, which in turn led to coverage on a radio show and a major network.

Spreading the Word


Perhaps the most attractive quality of social media is its ability to disseminate information. Above all, social media is about engagement and communication, making it ideal for researchers to reach a wider audience.

Of course, depending on the type of article you produce, you’ll reach a very particular audience. More scholarly and academic articles will likely attract other researchers in your discipline. However, this means you’ll likely alienate the layperson as they won’t want to wade through pages of information.

Because social media is so effective at attracting attention, a careless post can backfire. Always read a post several times before hitting the send or publish button to ensure it doesn’t come off as offensive or in poor taste. One poorly worded tweet can reach thousands of people and ultimately lower your credibility not only among your followers but in the scientific community as well.

While there are obvious benefits to using social media, it still doesn’t replace face-to-face interactions. If used improperly, it may end up hurting your reputation and research more than it helps. At the end of the day, it all depends on how you approach it and how you engage with your community.

Monday, 9 January 2017

“The Big Data rich and the Big Data poor”: the new digital divide raises questions about future academic research

Data is being created faster than ever before. However, as Katie Metzler explains, limited access to this big data is creating a digital divide between large companies and the broader scholarly community. To compound this problem, there is also a big data analysis skills gap that further hinders the progress of social science. Without access to these datasets or the expertise to analyse them, research is confronted with a replication crisis and is vulnerable to commercial motivations.
“Data is the new oil.” Clive Humby, mathematician and architect of Tesco’s Clubcard, is credited with saying this first in 2006, and it’s been repeated numerous times in the last decade. The comparison between data and oil refers to its value being extracted through refinement; or in the case of data, through analysis. Unlike oil, data is being created at a faster pace than it can be consumed, or analysed. We’re awash with data. You may have heard it said that “90% of all the data in the world has been generated over the last two years.” Or, as Hal R. Varian, Chief Economist at Google, puts it another way: “A billion hours ago, modern homo sapiens emerged. A billion minutes ago, Christianity began. A billion seconds ago, the IBM PC was released. A billion Google searches ago … was this morning.”
The capacity to collect and analyse massive datasets has already transformed fields such as biology, astronomy, and physics, and for many, the ‘big data revolution’ promises to ask, and help us answer, fundamental questions about individuals and collectives. But who gets access to all this data we’re producing through our increasingly networked and digital lives, and for what purpose?
Image credit: Divided by David Wan. This work is licensed under a CC BY 2.0 license.
In 2012, danah boyd and Kate Crawford offered a provocation that the limited access to big data was creating a new digital divide between “the Big Data rich and the Big Data poor.” It’s only companies, and the social scientists working within these companies, that have access to really large social and transactional datasets. The broader scholarly community usually does not because companies refuse to release it or because purchasing it costs too much.
Recently, I conducted a survey of more than 9,000 social scientists to learn more about researchers who are engaged in research using big data and the challenges they face, as well as the barriers to entry for those looking to do this kind of research in the future. 32 per cent of respondents who are currently engaged in big data research reported that getting access to commercial or proprietary data was a “big problem” for them:
Figure 1: Challenges facing big data researchers (n = 2273)
But it isn’t only the question of who can access data that leads to divides. As boyd and Crawford point out, and our survey supports, there is also a skills gap holding social science back: the level of quantitative and programming skills required for big data research makes it a challenge for educators to introduce it into traditional social science degree courses, as there is little time or expertise amongst teaching faculty:
Figure 2: Challenges facing educators teaching big data (n = 1212)
Why does it matter?
So who cares if academic social scientists can’t do big data, either because they can’t access the data and/or don’t have the skills they need to engage with it? Why not just have companies like Twitter and Facebook analysing social media data? Some have even gone so far as to argue that academics should not engage in research that can be done better by industry.
There are a couple of reasons why this is problematic. Firstly, replication is the engine of science, and irreproducible research slows progress. If only researchers within companies can access and analyse big social datasets, “those without access can neither reproduce nor evaluate the methodological claims of those who have privileged access”.
And secondly, and arguably most importantly, the motivations of industry researchers and social scientists may differ in ways that may really matter. Big data research conducted by companies is usually in service of a single overarching goal: to sell you more stuff. Social scientists with the right skills and access to the right data may use their research to contribute to the body of knowledge, with the aim of better understanding and improving social outcomes.
The questions boyd and Crawford pose at the start of their paper summarize this perfectly. They ask:
“Will large-scale search data help us create better tools, services, and public goods? Or will it usher in a new wave of privacy incursions and invasive marketing? Will data analytics help us understand online communities and political movements? Or will it be used to track protesters and suppress speech? Will it transform how we study human communication and culture, or narrow the palette of research options and alter what ‘research’ means?”
As of yet, the answers to these important questions are unclear.
Read more in the recent SAGE Publishing white paper revealing full results of the survey, “Who is Doing Computational Social Science? Trends in Big Data Research.”
About the author
Katie Metzler is Head of Methods Innovation at SAGE Publishing. Katie is responsible for content strategy and innovation for SAGE’s award winning online platform for researchers, SAGE Research Methods, which includes SAGE Research Methods Cases, SAGE Research Methods Datasets and SAGE Research Methods Video. In addition to heading up the London commissioning team for the SAGE Research Methods platform, she is part of a new team at SAGE whose mission is to improve social science by equipping every researcher with the skills and tools they need to work effectively with big data and new technology. At SAGE, we believe big data and new technology are fundamentally changing how we make sense of the world and that social science needs to play a critical role where this impacts on society.