Multimodal social data to study human behaviour

On 23 October 2019, Dr Scott Hale was the speaker.

Dr Hale explained how we generate unprecedented quantities of data through our online social interactions, and how new computational approaches to these data enable social science research at scale. Most communication online involves images or video in addition to text. Dr Hale showed how jointly considering text and image data from social media profiles leads to more accurate estimates of the age, gender, and organization status of users than text or image data alone. The multimodal deep neural architecture he presented operates in 32 languages and substantially outperforms the current state of the art while also reducing algorithmic bias. Its output can be used to help correct for the non-representative nature of online data.

Dr Hale also discussed his research analysing semantic change on social media using word embeddings: dense vector representations of words learnt from large volumes of text. The results show that some of the "words" undergoing the greatest change in meaning on Twitter over the last five years are emoji.

Finally, Dr Hale briefly discussed his research tracking the spread of memes using image hashing and analysing the similarity of television news across languages.
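The talk did not walk through the architecture itself, but the idea of combining text and image signals can be illustrated with a minimal late-fusion sketch: each modality is projected into a shared hidden space, the projections are concatenated, and separate heads predict age, gender, and organization status. The feature dimensions, number of age bands, and layer sizes below are illustrative assumptions, not details from the talk.

```python
# Minimal late-fusion sketch (illustrative only, not the architecture from the talk).
import torch
import torch.nn as nn

class MultimodalProfileClassifier(nn.Module):
    def __init__(self, text_dim=768, image_dim=2048, hidden_dim=256):
        super().__init__()
        # Project each modality into a common hidden space.
        self.text_proj = nn.Sequential(nn.Linear(text_dim, hidden_dim), nn.ReLU())
        self.image_proj = nn.Sequential(nn.Linear(image_dim, hidden_dim), nn.ReLU())
        fused_dim = hidden_dim * 2
        # One head per attribute mentioned in the talk (class counts assumed).
        self.age_head = nn.Linear(fused_dim, 4)     # e.g. four age bands
        self.gender_head = nn.Linear(fused_dim, 2)
        self.org_head = nn.Linear(fused_dim, 2)     # individual vs. organization

    def forward(self, text_feats, image_feats):
        fused = torch.cat([self.text_proj(text_feats),
                           self.image_proj(image_feats)], dim=-1)
        return self.age_head(fused), self.gender_head(fused), self.org_head(fused)

# Toy usage: random tensors stand in for features from real text/image encoders.
model = MultimodalProfileClassifier()
age, gender, org = model(torch.randn(8, 768), torch.randn(8, 2048))
print(age.shape, gender.shape, org.shape)
```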
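One common way to measure semantic change with word embeddings, offered here as an illustrative sketch rather than the specific method from the talk, is to train embeddings on two time periods, align the spaces with orthogonal Procrustes over the shared vocabulary, and rank words (including emoji) by how far they move. The vectors and vocabulary below are random placeholders for that pipeline.

```python
# Sketch: rank words/emoji by semantic change between two time periods.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["happy", "stream", "😂", "🙏"]                   # shared vocabulary (assumed)
period_a = {w: rng.standard_normal(50) for w in vocab}    # embeddings from period 1
period_b = {w: rng.standard_normal(50) for w in vocab}    # embeddings from period 2

A = np.stack([period_a[w] for w in vocab])
B = np.stack([period_b[w] for w in vocab])

# Orthogonal Procrustes: rotation R minimising ||A R - B||_F.
u, _, vt = np.linalg.svd(A.T @ B)
A_aligned = A @ (u @ vt)

def cosine_distance(x, y):
    return 1 - np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

# Larger distance after alignment = greater shift in meaning between periods.
change = {w: cosine_distance(a, b) for w, a, b in zip(vocab, A_aligned, B)}
print(sorted(change.items(), key=lambda kv: -kv[1]))
```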
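Image hashing, the technique mentioned for tracking memes, can be sketched with a simple average hash: shrink and greyscale an image, threshold each pixel at the mean brightness to get a short binary fingerprint, and treat small Hamming distances between fingerprints as near-duplicates. This is a generic illustration of the idea, not the specific hashing scheme used in the research, and the file names in the usage comments are hypothetical.

```python
# Sketch: average hash for grouping near-duplicate (meme) images.
from PIL import Image

def average_hash(path, hash_size=8):
    # Shrink, greyscale, and threshold at the mean to get a 64-bit fingerprint.
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    return [int(p > mean) for p in pixels]

def hamming(h1, h2):
    # Number of differing bits; small values indicate visually similar images.
    return sum(a != b for a, b in zip(h1, h2))

# Hypothetical usage: two variants of the same meme vs. an unrelated photo.
# h1 = average_hash("meme_original.jpg")
# h2 = average_hash("meme_with_caption.jpg")
# h3 = average_hash("unrelated.jpg")
# print(hamming(h1, h2), hamming(h1, h3))  # expect small vs. large distance
```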