Overview of multimodal literacy

A multimodal text conveys meaning through a combination of two or more modes. For example, a poster conveys meaning through a combination of written language, still image, and spatial design. Each mode uses unique semiotic resources to create meaning (Kress, 2010). Each mode has its own specific task and function (Kress, 2010, p. 28) in the meaning making process, and usually carries only a part of the message in a multimodal text. In a picture book, the print and the image both contribute to the overall telling of the story but do so in different ways. In a visual text, for example, representation of people, objects, and places can be conveyed using choices of visual semiotic resources such as line, shape, size, and symbols, while written language would convey this meaning through sentences using noun groups and adjectives (Callow, 2023), which are written or typed on paper or a screen.

Images may simply illustrate or expand on the written story, or can be used to tell different aspects of the story, even contradicting the written words (Guijarro and Sanz, 2009, p. 107).

Effective multimodal authors creatively integrate modes in various configurations to coherently convey the meaning required, ‘moving the emphasis backwards and forwards between the various modes’ (Cope and Kalantzis, 2009, p. 423) throughout the text.

The complexity of the relationships between the various meaning or semiotic systems in a text increases proportionately with the number of modes involved. For example, a film text is a more complex multimodal text than a poster as it dynamically combines the semiotic systems of moving image, audio, spoken language, written language, space, and gesture (acting) to convey meaning.

EAL/D learners and multilingual multimodal texts

Multimodal texts incorporating English and EAL/D learners’ home languages can be a source of authentic model texts. For EAL/D learners and other students who are plurilingual, it is important to consider how English and home languages can be integrated through different modes.  

Examples of such texts include:

  • videos with speech in one language and subtitles in another
  • bilingual or multilingual picture books with text in two or more languages
  • instruction manuals with information presented in pictures and translated into multiple languages
  • restaurant menus with images, dish names and descriptions in different languages.

Multilingual multimodal texts clearly demonstrate the author’s word and visual choices. Discussion questions about such a text could include:

  • Which languages are used in the text and why?
  • Why did the author choose these languages and not others?
  • What function does each language have?
  • How would the text be different if it were only in one language/English?
  • How do you read/understand texts that include languages you don't know?
  • What are the advantages of watching a film with subtitles versus dubbing?
  • Why are certain words in the novel written in another language, and italicised?


The following overview of how meaning can be composed through different semiotic resources for each mode (spoken language, written language, visual, audio, gestural, and spatial) is informed by The New London Group (2000), Cope and Kalantzis (2009), and Kalantzis, Cope, Chan, and Dalley-Trim (2016). EAL/D learners engage with all of these meaning making practices through a multicultural and/or multilingual lens.

Currently, there is extensive pedagogic support for teaching meaning making through spoken and written language, and some resources developed to support teaching meaning making in the visual mode, through ‘viewing’. However, as yet there are few resources available for teaching young students how to comprehend and compose meaning in the other modes. 

Written meaning

Conveyed through written language via handwriting, the printed page, and the screen. Choices of words, phrases, and sentences are organised through linguistic grammar conventions, register (where language is varied according to context), and genre (knowledge of how a text type is organised and staged to meet a specific purpose). See: Writing and Reading and Viewing

In bilingual or multilingual texts, written meaning may be conveyed through different scripts and laid out differently, whether typed or handwritten. EAL/D learners may also write words from their home languages using English letters (transliteration).

Spoken (oral) meaning

Conveyed through spoken language via live or recorded speech and can be monologic or dialogic. Choices of words, phrases, and sentences are organised through linguistic grammar conventions, register, and genre. Composing oral meaning includes choices around mood, emotion, emphasis, fluency, speed, volume, tempo, pitch, rhythm, pronunciation, intonation, and dialect. EAL/D learners may make additional choices around the use of home languages to create mood or emphasise meaning. See: Speaking and listening pedagogic resources.

Visual meaning

Conveyed through choices of visual resources and includes both still and moving images. Images may carry diverse cultural connotations and symbolism, and may portray different people, cultures and practices. Visual resources include: framing, vectors, symbols, perspective, gaze, point of view, colour, texture, line, shape, casting, saliency, distance, angles, form, power, involvement/detachment, contrast, lighting, naturalistic/non-naturalistic, camera movement, and subject movement. See Visual literacy metalanguage.

Audio meaning

Conveyed through sound, including choices of music representing different cultures, ambient sounds, noises, alerts, silence, natural/unnatural sounds, and use of volume, beat, tempo, pitch, and rhythm. Lyrics in a song may also include multiple languages.

Spatial meaning

Conveyed through design of spaces, using choices of spatial resources including: scale, proximity, boundaries, direction, layout, and organisation of objects in the space. Space extends from design of the page in a book, a page in a graphic novel or comic, a webpage on the screen, framing of shots in moving image, to the design of a room, architecture, streetscapes, and landscapes.

Gestural meaning

Conveyed through choices of body movement, including facial expression, eye movements and gaze, demeanour, gait, dance, acting, and action sequences. It also includes use of rhythm, speed, stillness and angles, including ‘timing, frequency, ceremony and ritual’ (Cope and Kalantzis, 2009, p. 362). Gestures and body language may have diverse cultural connotations.

Types of multimodal texts

Multimodality does not necessarily mean use of technology, and multimodal texts can be paper-based, live, or digital.

Paper-based multimodal texts include picture books, textbooks, graphic novels, comics, and posters.

Live multimodal texts, for example, dance, performance, and oral storytelling, convey meaning through combinations of various modes such as gestural, spatial, audio, and oral language.

Digital multimodal texts include film, animation, slide shows, e-posters, digital stories, podcasts, and web pages that may include hyperlinks to external pronunciation guides or translations.

Why teaching multimodal literacy is important

Effective contemporary communication requires young people to be able to comprehend, respond to, and compose meaning through multimodal texts in diverse forms.

To do this, students need to know how each mode uses unique semiotic resources to convey meaning (Kress, 2010), and this needs to be taught explicitly. In a visual text, for example, representation of people, objects, and places can be conveyed using choices of visual semiotic resources such as line, shape, size, and symbols, while written language would convey this meaning through sentences using noun groups and adjectives (Callow, 2023) written or typed on paper or a screen.

Students also need to be taught how authors juggle the different modes to determine the most appropriate way to tell their story, and how meaning in a multimodal text is ‘orchestrated’ through the selection and use of different modes in various combinations (Jewitt, 2009, p. 15).

How multimodal texts support EAL/D learners to understand and communicate complex ideas

Multimodal texts allow a broader range of communication options for EAL/D learners that do not rely solely on traditional spoken and print-based texts. Reading, viewing and creating multimodal texts provides EAL/D learners with additional ways to understand and communicate complex ideas despite a language barrier, giving them more equitable access to learning and communication (Walsh et al., 2015). For example, EAL/D learners with very low English proficiency can still use gestures and draw pictures to communicate their meaning. With support, they can view and understand new and abstract concepts through a digital text that can then be associated with new language.

EAL/D learners engage with multimodal texts in their everyday lives. Research demonstrates that students learn best when school practices reflect familiar home and community practices (Gutiérrez, Baquedano-López & Turner, 1997; Gutiérrez, 2008). Most EAL/D learners can confidently create content using familiar digital tools. This can then be used as a platform to expand their use of language in combination with other modes of communication (Walsh et al., 2015).

To teach multimodal literacy, the teacher selects model multimodal texts that are appropriate to the purpose of a task or lesson. The teacher explicitly scaffolds how language combines with paper, live and digital multimedia platforms to communicate effectively.   


Modes and meaning making: three sub-strands

Students need to understand how authors can control and use the unique semiotic resources available in each different mode used in a multimodal text. Currently, the Victorian Curriculum organises teaching about language around three types of meaning organised as sub-strands: Expressing and developing ideas; Language for interaction; and Text structure and organisation. Similarly, teaching meaning making in other modes can be approached through three sub-strands.

Expressing and developing ideas

What is happening in the text? Students learn how the different meaning making resources can be used to construct the nature of the events, the objects and participants involved, and the setting and circumstances in which they occur (who, what, where and when), and to express actions and ideas.

Interacting and relating with others

How do we interact with and relate to others? How do we feel? Students learn how design of interactive meaning in a multimodal text includes consideration of the social setting, how interactions between the viewer/reader/listener and the subject can be established, and how to build and maintain relationships. Students need to learn how to express knowledge, skills, feelings, attitudes and opinions, credibility, and power through different modes. 

Text structure and organisation

How do design and layout build meaning and guide the reader/viewer/listener through the text? Students learn how different modes are used to structure a text in a particular way to create cohesive and coherent texts, with varying levels of complexity. For example, students learn how the image maker guides the viewer through the text by making deliberate visual design choices at the level of the whole text and of components within it. In examining how the image or text is organised, students learn how visual design choices can prioritise some meanings and background others (Painter, Martin, & Unsworth, 2013).

(For further information, see Anstey and Bull, 2009; Callow, 2013; Cloonan, 2011; Kalantzis, Cope, Chan, and Dalley-Trim, 2016.)

Modes and meaning making for EAL/D learners: strands and sub-strands

The Victorian Curriculum F-10 EAL organises the strands and sub-strands for each language mode (Speaking and listening, Reading and viewing, and Writing) differently from the English curriculum. The three strands in the EAL curriculum are Communication; Cultural and plurilingual awareness; and Linguistic structures and features. These strands are divided further into sub-strands. For example, Cultural and plurilingual awareness contains two sub-strands: Cultural understandings and Plurilingual strategies.


How can teachers support their EAL/D students to make meaning clear to their audience? EAL/D students learn to understand, analyse and produce a range of text types. To create different multimodal texts, the teacher scaffolds EAL/D learners to consider:  

  • how they can use their languages in posters/videos/digital stories to communicate most effectively
  • what messages they should convey
  • the key features or sections of text that should be included
  • how visuals or audio can be used to help the audience’s understanding
  • what languages the target audience can use to understand the text.

Plurilingual strategies

How do teachers support EAL/D learners to use plurilingual strategies? Teachers should teach EAL/D students how the different cultural conventions of text design and layout enhance the message of the text. They also teach the processes of planning, producing and revising, and provide worked examples to show how these can help create a high-quality final text. At each stage, students consider how they can enhance the text by incorporating their knowledge of English, home languages and cultural knowledge. For example, EAL/D learners preparing a program for a school production might consider whether there should be multiple versions of the program in different languages or whether only some aspects of the program need to be translated (e.g. the synopsis).

Linguistic structures and features

In the EAL curriculum, the Linguistic structures and features strand encompasses the Text structure and organisation sub-strand. EAL/D students learn to control language at the word, sentence and whole text level. With support, students learn to choose language appropriate for the topic. As they become more proficient, they will be able to make choices about expressing that language in more spoken-like or more written-like ways to suit their text type. Students also consider their relationship with the audience and how this can be communicated through appropriate and accurate language choices. At all stages of this process, EAL/D students will require scaffolded support to develop an awareness and understanding of the impact of their language choices.



Anstey, M., & Bull, G. (2009). Using multimodal texts and digital resources in a multiliterate classroom. In e:lit (Vol. 004, pp. 1-8). Sydney: Primary English Teaching Association.

Callow, J. (2023). The Shape of Text to Come: How Image and Text Work (2nd ed.). Sydney: Primary English Teaching Association of Australia.

Cloonan, A. (2011). Creating multimodal metalanguage with teachers. English Teaching, 10(4), 23.

Cope, B., & Kalantzis, M. (2009). A grammar of multimodality. The International Journal of Learning, 16(2), 361-423.

Guijarro, J. M., & Sanz, M. J. (2009). On interaction of image and verbal text in a picture book: A multimodal and systemic functional study. In E. Ventola & J. M. Guijarro (Eds.), The World Told and the World Shown: Multisemiotic Issues (pp. 107-123). Palgrave Macmillan.

Jewitt, C. (Ed.). (2011). The Routledge Handbook of Multimodal Analysis. London: Routledge.

Kalantzis, M., Cope, B., Chan, E., & Dalley-Trim, L. (2016). Literacies (2nd ed.). Port Melbourne, VIC, Australia: Cambridge University Press.

Kress, G. (2010). Multimodality: A social semiotic approach to contemporary communication. London; New York: Routledge.

Painter, C., Martin, J. R., & Unsworth, L. (2013). Reading Visual Narratives: Image analysis of children's picture books. Equinox Publishing Limited.

The New London Group. (2000). A pedagogy of multiliteracies: Designing social futures. In B. Cope & M. Kalantzis (Eds.), Multiliteracies: Literacy Learning and the Design of Social Futures (pp. 9-38). South Yarra: Macmillan.

Walsh, M., Durrant, C., & Simpson, A. (2015). Moving in a Multimodal Landscape: Examining 21st Century Pedagogy for Multicultural and Multilingual Students. English in Australia, 50(1), 67-76.