Dr Kim Starr


Research Fellow in Translation and Multimodal Technologies
B.Sc (Econ), MA (Journalism), MA (Monolingual Subtitling and Audio Description), PhD, Graduate Certificate in Teaching and Learning.
+44 (0)1483 689960
14 LC 03

Academic and research departments

Literature and Languages.

About

University roles and responsibilities

  • Research Fellow

    News

    In the media

    MeMAD

    Research

    Research interests

    Research projects

    Teaching

    Publications

    Highlights

    Starr, K. (2022) ‘Audio Description for the Non-Blind’, in Taylor, C. and Perego, E. (eds.) The Routledge Handbook of Audio Description. Abingdon: Routledge.

    Braun, S. and Starr, K. (2022) ‘Automation in Audio Description’, in Taylor, C. and Perego, E. (eds.) The Routledge Handbook of Audio Description. Abingdon: Routledge.

    Braun, S. and Starr, K. (2020) Innovation in Audio Description Research. Abingdon: Routledge.

    Kim Starr (2015) 'Different Minds, Different Audiences'. IPCITI, 11th International Postgraduate Conference in Translation and Interpreting, University of Edinburgh.
    Kim Starr (2016) 'Audio Description for New Audiences'. Postgraduate Research Conference, University of Surrey.
    Kim Starr (2017) 'Thinking Inside the Box': Audio Description for Cognitively Diverse Audiences. ARSAD Conference, Universitat Autònoma de Barcelona (UAB)

    According to Snyder, “AD is about democracy” (2005: 16 in Mazur & Chmiel, 2012), yet audio description (AD) research and practice remain fundamentally focused on optimising accessibility to multimodal texts by reference to those affected by physical (visual) impairment.

    Nevertheless, recent evidence suggests audiences requiring cognitive assistance, including individuals on the autism spectrum experiencing emotion-recognition difficulties (ERDs), may also benefit from supplementary audio description (Fellowes, 2012). 

    While previous studies have considered the AD of emotions, emotional lexicon and describing gestures and facial expression for visually impaired audiences (Ramos, 2015; Salway & Graham, 2003; Mazur & Chmiel, 2012), emotion-centric AD has not been considered in relation to audiences with emotion recognition difficulties.

    This study employs a functionalist, skopos-based approach (Reiss and Vermeer, 1984 in Nord, 1997:29) to AD creation in order to examine the potential for extending the reach of audio description into the domain of supplemental cognitive narrative through the adoption of bespoke, dianoic (‘between minds’) translation strategies.

    Consideration will be given to the results of an empirical study trialling prototype cognitive AD outputs alongside standard audio description with young autistic spectrum audiences who typically experience difficulty identifying emotions and ‘states of mind’ in others. For this purpose, three discrete AD orientations were created from audiovisual source texts:

    (i) standard, blind and visually-impaired (BVI) AD, designed to be visually restorative; (ii) bespoke descriptive AD, created for audiences experiencing difficulty reading emotions and ‘states of mind’, and characterised by the identification and labelling of emotive-markers (EMO); and (iii) bespoke interpretative AD, created for audiences requiring additional assistance with assigning causality and consequence to emotions, and characterised by the contextualisation of emotive-markers (CXT).

    Findings from the study are used to support the case for a fundamental reappraisal of AD as an accessibility tool. It will be argued that by developing dianoic translation strategies it should be possible to employ AD to enhance access to audiovisual materials for a range of audiences with atypical cognitive needs. To this end, examples of coincidence and divergence between standard BVI and cognitive EMO/CXT target texts will be given particular consideration for their potential to serve both types of audience simultaneously and independently.  

    The presentation will conclude with a brief discussion of the manner in which dianoic translation might be further developed to deliver competing AD channels in ‘multiplex’/‘red button’ television environments.

    Braun, S. and Starr, K. (2018) From Slicing Bananas to Pluto the Dog: Human and Automatic Approaches to Visual Storytelling. Languages and the Media Conference, Berlin (3rd - 5th, October).

    This project will develop novel methods of analysing and describing audiovisual content based on a combination of computer vision techniques, human input and machine learning approaches to derive enhanced automatic descriptions. These descriptions will enable people working in the creative industries as well as people using their services to access, use and find audiovisual information in novel ways.

    Braun, S. and Starr, K. (2019) Mind the Gap: Omissions in AD ... and Why Machines Need Humans. ARSAD Conference, Barcelona, 19-20th, March.

    There is broad consensus that audio description (AD) is a modality of intersemiotic translation, but there are different views in relation to how AD can be more precisely conceptualised. While Benecke (2014) characterises AD as ‘partial translation’, Braun (2016) hypothesises that what audio describers appear to ‘omit’ from their descriptions can normally be inferred by the audience, drawing on narrative cues from dialogue, mise-en-scène, kinesis, music or sound effects. This presentation reports on a study that is testing these hypotheses empirically.

    Conducted as part of the EU-funded MeMAD project, our research aims to improve access to, and management of, audiovisual (AV) content through various methods, including by enhancing the automation of AV content description through a combination of approaches from computer vision, machine learning and human approaches to describing AV material. To this end, one of the MeMAD workstreams analyses how human audio descriptions approach the rendition of visually salient cues. We use a corpus of approx. 500 audio described film extracts to identify substantive visual elements, i.e. elements that can be considered essential for the construction of the filmic narrative, and analyse the corresponding audio descriptions in terms of how these elements are verbally represented. Where omissions in the audio description appear to occur, we conduct a qualitative analysis to establish whether the ‘omitted’ elements can be inferred from the co-text of the AD and/or from other cues that are accessible to visually impaired audiences (e.g. the film dialogue). Where possible, we establish the most likely source of these inferences.

    In this presentation we outline the findings of the study and discuss their relevance, which we show to be twofold. Firstly, the study provides novel insights into a crucial aspect of AD practice and can inform approaches to training. Secondly, by highlighting how human audiences use their ability to draw inferences to build a coherent interpretation of what they perceive, the results of the study can also inform machine-based approaches to developing human-like descriptions of AV material.

    Braun, S. and Starr, K. (2019) 'Comparing Human and Automated Approaches to Video Description'
    Media For All 8 Conference, Stockholm, 17th - 19th June, 2019.

    The recent proliferation of (audio)visual content on the Internet, including increased user-generated content, intersects with European-wide legislative efforts to make (audio)visual content more accessible for diverse audiences. As far as access to visual content is concerned, audio description (AD) is an established method for making content accessible to audiences with visual impairment. However, AD is expensive to produce and its coverage remains limited. This applies particularly to the often ephemeral user-generated (audio)visual content on social media, but the Internet more broadly remains less accessible for people with sight loss, despite its high social relevance for people’s everyday lives.

    Advances in computer vision, machine learning and AI have led to increasingly accurate automatic image description. Although currently focused on still images, attempts at automating moving image description have also begun to emerge (Huang et al. 2015, Rohrbach et al. 2017). One obvious question arising from these developments is how machine-generated descriptions compare with their human-made counterparts. Initial examination reveals stark differences between the two methods. A more immediate question is where human endeavour might prove most fruitful in the development of effective approaches to automating moving image description.

    This presentation reports on an initial study comparing human and machine-generated descriptions of moving images, aimed at identifying the key characteristics and patterns of each method. The study draws on corpus-based and discourse-based approaches to analyse, for example, lexical choices, focalisation and consistency of description. In particular, we will discuss human techniques and strategies which can inform and guide the automation of description. The broader aim of this work is to advance current understanding of multimodal content description and contribute to enhancing content description services and technologies.

    This presentation is supported by an EU H2020 grant (MeMAD: Methods for Managing Audiovisual Data: Combining Automatic Efficiency with Human Accuracy).

    Starr, K., Braun, S. and Delfani, J. (2020) ‘Taking a Cue from the Human: Linguistic and Visual Prompts for the Automatic Sequencing of Multimodal Narrative’. Journal of Audiovisual Translation, 3(2), pp. 140-169. https://doi.org/10.47476/jat.v3i2.2020.138
    Braun, S., Starr, K. and Laaksonen, J. (2020) ‘Comparing Human and Automated Approaches to Visual Storytelling’, in Braun, S. and Starr, K. (eds) Innovation in Audio Description Research. Abingdon: Routledge, pp. 159-196.
    Braun, S. and Starr, K. (2019) ‘Finding the Right Words: Investigating Machine-Generated Video Description Quality Using a Human-derived Corpus-based Approach’. Journal of Audiovisual Translation, 2(2), pp. 11-25. https://doi.org/10.47476/jat.v2i2.103
    Braun, S. and Starr, K. (2021) ‘Byte-Sized Storytelling: Training the Machine to See the Bigger Picture’. Languages and The Media, Berlin, 20-23rd September.
    Starr, K. (2021) ‘Do You See What I See? Addressing the Practical and Ethical Issues of Using Audio Description as a Cognitively Oriented Accessibility Service’. IATIS, Barcelona, 14-17th September.
    Braun, S., Starr, K., Delfani, J., Tiittula, L., Laaksonen, J., Braeckman, K., Van Rijsselbergen, D., Lagrillière, S. and Saarikoski, L. (2021) ‘When Worlds Collide: AI-created, Human-mediated Video Description Services and the User-Experience’. UAHCI, Washington DC/online, 24-29th July.
    Starr, K. and Braun, S. (2020) ‘Audio Description 2.0: Re-versioning audiovisual accessibility to assist emotion recognition’, in Braun, S. and Starr, K. (eds) Innovation in Audio Description Research. Abingdon: Routledge, pp. 97-120.
    Starr, K., Braun, S. and Delfani, J. (2021) ‘The Sentient Being’s Guide to Automatic Video Description: a Six-Point Roadmap for Building the Computer Model of the Future’. Media for All 9, Barcelona/online, 27-29th January.
    Braun, S. and Starr, K. (2020) ‘Mapping New Horizons in Audio Description Research’, in Braun, S. and Starr, K. (eds) Innovation in Audio Description Research. Abingdon: Routledge, pp. 1-12.
    Braun, S. and Starr, K. (eds) (2020) Innovation in Audio Description Research. Abingdon: Routledge.