Creating multimodal texts

Multimodal texts combine two or more modes such as written language, spoken language, visual (still and moving image), audio, gestural, and spatial meaning (The New London Group, 2000; Cope and Kalantzis, 2009). Creating digital multimodal texts involves the use of communication technologies; however, multimodal texts can also be paper-based or live performances.

The Victorian Curriculum recognises that students need to be able to create a range of increasingly complex and sophisticated spoken, written, and multimodal texts for different purposes and audiences, with accuracy, fluency and purpose.

Why teaching creating multimodal texts is important

Creating multimodal texts is an increasingly common practice in contemporary classrooms. Easy-to-produce multimodal texts include posters, storyboards, oral presentations, picture books, brochures, slide shows (PowerPoint), blogs, and podcasts. More complex digital multimodal text productions include web pages, digital stories, interactive stories, animation, and film.

Student authors need to be able to effectively create multimodal texts for different purposes and audiences, with accuracy, fluency, and imagination. To do this, students need to know how meaning is conveyed through the various modes used in the text, as well as how multiple modes work together in different ways to convey the story or the information to be communicated.

Students need to know how to creatively and purposefully choose how different modes might convey particular meaning at different times in their texts, and how to manipulate the various combinations of different modes across the whole text to best tell their story (Jewitt, 2009). See: Modes.

Why teaching EAL/D learners to create multilingual multimodal texts is important

Multimodal texts containing elements of other languages support EAL/D students to engage and achieve at school. Students use the language and social abilities that they develop outside of school in classroom communication and tasks. These abilities include translating, combining more than one language to communicate and learn, and using diverse linguistic and cultural practices when they communicate.

When EAL/D students use all their language abilities in a learning task, they make connections between existing and new knowledge. This enhances engagement and affirms their identities as learners who can integrate their knowledge of multiple languages to communicate and learn a new language.

EAL/D students learn to think critically about the purpose and function of each language they use in a multimodal text. Using the teaching and learning cycle, the teacher explicitly teaches the language and text structures that students need to complete these tasks. 

EAL/D student authors who can use English and their home language to create texts for multilingual audiences can, with support, choose how different multimedia modes and different languages combine in a text. This allows students to make creative and purposeful decisions about how to communicate effectively to particular audiences.

The choice to include elements of other languages in a text is an overt and concrete means by which students can develop their skills as text analysts. They detect and analyse underlying values, beliefs and views, and discern how the reader/viewer is positioned within the text.

Support students to analyse a text by asking questions such as:  

  • Why include more than one language?
  • Who is included/excluded?
  • What information should be contained in English and the other language? Do they need to be the same?
  • How might a monolingual English speaker view the text? In what ways would it be different from a bilingual speaker or non-speaker of English?

For more information about the text analyst role, see: The four resources model for reading and viewing


Examples of texts to create

Below are examples of different forms of texts students might create in the classroom. The complexity of creating texts increases proportionately with the number of modes involved and the relationships between the various semiotic, or meaning-making, systems in a text, as well as the use of more complicated digital technologies.

Simple multimodal texts include comics/graphic novels, picture books, newspapers, brochures, print advertisements, posters, storyboards, digital slide presentations (e.g. PowerPoint), e-posters, e-books, and social media.

Meaning is conveyed to the reader through varying combinations of written language, visual, gestural, and spatial modes.

Podcasts are also simple to produce, involving combinations of spoken language and audio modes.

Live multimodal texts include dance, performance, oral storytelling, and presentations. Meaning is conveyed through combinations of various modes such as gestural, spatial, audio, and oral language.

Simple multimodal texts and EAL/D learners

EAL/D learners can be supported to understand and create simple multimodal texts that reflect the diversity in languages and cultures within the school. For example:  

  • creating posters, newsletters, brochures or blogs with sections translated into home languages, or headings, captions and diagrams labelled in English and home languages. Students can also add glossaries or translations of key terms
  • creating comics with captions and speech bubbles written in English and home language, as appropriate for the purpose and audience. This could include different scripts in illustrated scenes and ‘sound effects’. Particular characters may also speak a combination of English and another language or dialect
  • creating translations of popular picture books, or of their own or their classmates’ stories, to contribute to the classroom library, making sure that meaning is not lost in translation. This could include using metaphors in their home languages that approximate the meaning in the English text
  • creating slideshows that include translated vocabulary, explanations or pronunciation guides
  • creating content for social media. EAL/D learners could type in different scripts or transliterate the sounds of their language using English script in social media. Social media users create and access videos, music, stories and memes in a range of languages
  • creating multilingual resources for the school community including signage, welcome packs, teaching and learning resources.

Students can also be supported to create live multimodal texts that reflect the diversity in languages and cultures within the school. For example, students can:

  • use music and gestures from different cultural dance traditions in dance performances
  • create translations to accompany school plays, for example, subtitles in English and/or another language and bilingual glossaries in the program
  • tell a story from their home culture in English, or retell a familiar English story in their home language.

Complex digital multimodal texts include live-action films, animations, digital stories, web pages, book trailers, documentaries, and music videos. Meaning is conveyed through dynamic combinations of various modes across written and spoken language, visual (still and moving image), audio, gesture (acting), and spatial semiotic resources. Producing these texts also requires skills with more sophisticated digital communication technologies.

Complex digital multimodal texts and EAL/D learners

EAL/D learners can incorporate multiple languages into complex digital multimodal texts by: 

  • writing the subtitles in English or a different language for films, animations, digital stories or documentaries. Support students to choose the most appropriate language for speech and subtitles, depending on their audience. Visual effects and images can be used to add text in multiple languages for emphasis or explanation
  • including hyperlinks and mouse-overs, which are an excellent way for students to provide translations, pronunciation of key terms or a glossary in web pages. Different sections of text can be written in different languages with translations into English, and multilingual audio or video clips may be incorporated. Students can also learn purposeful ways of incorporating computer translation tools into web pages they create
  • incorporating English and home languages into their music videos and song lyrics. These may be accompanied by text or subtitles, or use visual effects to emphasise words or phrases in different languages
  • creating original films, animations and digital stories using voiceover, with or without subtitles.

What teachers and students need to know

Skilled multimodal composition requires students to have knowledge of the subject or field of the text, and textual knowledge of how to best convey meaning through the text. Digital multimodal authoring also requires knowledge of the technology and of the processes required to produce innovative digital media productions (Mills, 2010).

Textual knowledge encompasses both semiotic knowledge and genre. Semiotic knowledge concerns how each mode conveys meaning in different ways in the text, where each mode has its specific task and function (Kress, 2010, p. 28) in the meaning-making process.

Multimodal authors also need to be able to imaginatively combine different modes in various strategic arrangements throughout the text, for example, print and visual semiotic resources in a picture book, to effectively and creatively convey the meaning required.

Genre concerns knowledge of the social functions and contexts in which a text is produced and used, and how the text is organised and staged to meet a specific social purpose (Martin and Rose, 2008). As with writing, successful multimodal composition includes consideration of purpose, audience and text type (for example, to entertain, inform, or persuade).

Technological knowledge concerns knowledge of the technical content as well as of the processes required to produce innovative digital media productions, including knowledge of the machines involved and the media applications (Mills, 2010, p. 224).

Effectively teaching students how to create multimodal texts requires new and diverse literacy skills and semiotic knowledge which, by necessity, extend beyond the realms of traditional print-based literacy into other learning disciplines. 

Literacy teachers need to draw on expertise, knowledge and skills from other disciplines to support the development of new literacy competencies. This includes essential aspects from The Arts – music, media, drama, film, and art; and from Information Communication Technologies (ICT).

To create multilingual multimodal texts that strategically include elements of EAL/D students’ home languages, students also need to know the features of both English and the home language (or additional language) that they want to publish in. This linguistic knowledge does not necessarily have to be comprehensive or formal, but rather appropriate for the purpose and audience of the text. Students working in groups may know different aspects of the language.

Teaching creating multimodal texts: production stages

Teaching creating multimodal texts is based on teaching writing, extended to teaching students how to produce short, purposeful, and engaging texts in different forms and media formats.

Students need to develop increasing control over the different semiotic contributions of each of the modes deployed, and at the same time, attend to creatively combining modes into a meaningful whole (Hull, 2005, p.234). In addition, pedagogic attention to any technical requirements is also essential.

Teaching creating multimodal texts can be structured in stages around the film production approach. This includes pre-production, production, and post-production.


The pre-production stage

The pre-production stage includes consideration of the topic, the purpose, the audience and the context. The story/content is drafted and organised, and manageable boundaries are established. This includes setting limits on the number of pages in a picture book or slides in a PowerPoint, or time limits for digital productions – 30 to 90 seconds is long enough for novice podcasts, film or animation productions.

The production process is planned. This might include writing a story outline that provides brief information about who, what, where, and when; a script that includes information about the text participants (characters or subjects), dialogue, action, sound effects, and music; and preparing a storyboard to scope the visual design of the text – what is to be shown and how it will be seen. (See Visual metalanguage for more information.)

Image 1: Storyboard example (Creative Commons BY-ND 4.0)

For EAL/D students to produce multilingual multimodal texts, they might engage in the pre-production stage using their strongest language to achieve depth in their ideas. This may mean students plan a multimodal text using a storyboard with descriptions in their home language. They can then discuss and refine their ideas with the teacher or other students using English.

If students create multimodal texts that include home languages, they may work with a same-language peer, bilingual staff member or parent to check and edit work that will be published. However, the EAL/D student must assume responsibility for discussing and reporting their work in English with peers and the teacher.

The production stage

The production stage is where the text is composed or produced. Production can be a simple process using familiar tools and resources or can involve learning to use more complex digital tools including cameras, recording equipment, or digital applications and software.

Complex media production processes can be simplified for the literacy classroom. For example, a simplified approach to creating live-action films involves an ‘in-camera’ edit. This requires the whole sequence to be carefully planned first. 

Beginning with the title shot, the film is shot in sequence, shot by shot, pausing the camera between shots. Sound effects and additional information must be recorded at the same time as the action. Following the final shot, the film is finished, and there is no further editing or post-production. The same approach can be used for recording simple podcasts, as an ‘in-microphone’ edit.

In contrast, a conventional approach to filmmaking/podcast production involves filming or recording the content in segments first and then putting the final text together through post-production.

The teacher may need to explicitly teach EAL/D students the use of the equipment and technical skills needed to capture and create digital multimodal texts. The teacher may provide reference materials with annotated visuals to support students in learning the technical language associated with production skills.

The post-production stage

In the post-production stage, filmed shots or recorded audio segments are edited using a digital editing program to remove sections, order information, and add in introductions, titles, music, and visual and sound effects.

The teacher explicitly teaches EAL/D students the technical skills needed to edit and manipulate multimodal texts. In addition to the general editing skills, the teacher may need to find a 'knowledgeable other' to teach students specific multilingual skills such as typing in different scripts or using translation apps.

For more information on EAL/D teaching strategies that support students to produce language and content for their multimodal texts, see Writing Process.

Using the teaching and learning cycle for creating multimodal texts

The teaching and learning cycle (TLC), initially developed for teaching writing and reading, provides a logical, systematic process for teaching creating multimodal texts (Zammit, 2014, 2015; Chandler, O’Brien and Unsworth, 2010).

This approach supports teaching students how to successfully create a range of different texts for different purposes and audiences, which communicate the author’s meaning (Miller, 2010, p.214) through attention to meaning design in the different modes deployed.

The teaching and learning cycle focuses on the cyclical nature of the teacher’s role through the various production stages. It includes teacher modelling, and explicit teaching of relevant semiotic knowledge and the metalanguage of meaning-making in different modes, as well as the skills required for effective use of any technology used.

Both textual knowledge (semiotic and genre) and the technological knowledge required need to be explicitly stated and incrementally taught (Christie and Macken-Horarik, 2007). Competent digital authoring requires coherent and systemic levels of pedagogical attention and support, in the same ways that writing is taught and valued in schools (Burn and Durran, 2006).

The TLC involves four key stages which incorporate social support for creating multimodal texts through varied interactional routines (whole group, small group, pair, individual) to scaffold students’ learning about meaning-making in a variety of modes and texts.

These stages are:

  • Building the context or field – understanding the purpose of the text and the context (genre) and building a shared understanding of the topic
  • Modelling the text (or deconstruction) – using mentor or model texts to focus explicitly on the structure of the text, identify the modes used and the different semiotic resources used in each mode, examine examples of meaning design choices made in different modes, explore how modes work independently and together to shape meaning, and build a metalanguage
  • Guided practice (or joint construction) – teachers and students jointly construct a text
  • Independent construction – students’ independent composing of a new text. (Derewianka and Jones, 2016; Humphrey, 2017; Humphrey and Feez, 2016)

Mentor or model texts need to be carefully selected by the teacher to support the students to work within their ‘zone of proximal development’ (Vygotsky, 1978) in developing their knowledge of how meaning is conveyed in different modes in different texts.

Depending on the year level, the selected text and the teaching focus, whole texts or text extracts can be used. See visual metalanguage for examples of visual semiotic resources, and the teaching and learning cycle for further guidance.

For more information on using the teaching and learning cycle with EAL/D students to create multimodal texts, see: Teaching and learning cycle for EAL/D learners.

Resources to support creating digital multimodal texts

  • Australian Centre for the Moving Image (ACMI): Film it - The filmmakers' tool kit
  • Creating multimodal texts
  • Education Department of Victoria, FUSE: search for Filmmaking 101
  • Education Services, Australia (ESA): Scootle (Search by keywords such as ‘create’, ‘filmmaking’, ‘comic’, ‘digital story’, ‘poster’, ‘blog’, ‘webpage’, ‘advertisement’, ‘design’. Refine search by year level, and subject area: English.)

Resources to support EAL/D learners to plan, draft, edit and publish in multiple languages include:

  • human resources such as teachers or support staff with knowledge of the language, same language peers or students from other classes, family or community members
  • text resources such as bilingual dictionaries, translation tools and software, publications or websites in the home language, and examples or models of multilingual texts

Teachers scaffold EAL/D learners to use these resources critically and effectively in creating meaning.


References

Burn, A., and Durran, J. (2006). Digital anatomies: analysis as production in media education. In D. Buckingham and R. Willett (Eds.), Digital Generations: Children, young people, and new media (pp. 273-293). New York, London: Lawrence Erlbaum Associates.

Chandler, P. D., O'Brien, A., and Unsworth, L. (2010). Towards a 3D multimodal curriculum for upper primary school. Australian Educational Computing, 25(1), 34-40.

Christie, F., and Macken-Horarik, M. (2007). Building verticality in subject English. In F. Christie and J. R. Martin (Eds.), Language, knowledge and pedagogy: functional linguistic and sociological perspectives (pp. 156-183). London; New York: Continuum.

Cope, B., and Kalantzis, M. (2009). A grammar of multimodality. The International Journal of Learning, 16(2), 361-423.

Hull, G. (2005). Locating the Semiotic Power of Multimodality. Written Communication, 22(2), 224-261.

Jewitt, C. (Ed.). (2009). The Routledge Handbook of Multimodal Analysis. London: Routledge.

Kress, G. (2010). Multimodality: a social semiotic approach to contemporary communication. London; New York: Routledge.

Martin, J. R., and Rose, D. (2008). Genre relations: mapping culture. London; Oakville, CT: Equinox Pub.

Miller, S. M. (2010). Towards a multimodal literacy pedagogy: Digital video composing as 21st-century literacy. In P. Albers (Ed.), Literacies, Art, and Multimodality (pp. 254-281). Urbana-Champaign, Illinois: National Council of Teachers of English.

Mills, K. A. (2010). What Learners "Know" through Digital Media Production: Learning by Design. E-Learning and Digital Media, 7(3), 223-236.

The New London Group. (2000). A pedagogy of multiliteracies: Designing social futures. In B. Cope and M. Kalantzis (Eds.), Multiliteracies: Literacy Learning and the Design of Social Futures (pp. 9-38). South Yarra: Macmillan.

Zammit, K. (2014). Creating Multimodal Texts in the Classroom: Shifting Teaching Practices, Influencing Student Outcomes. In R. E. Ferdig and K. E. Pytash (Eds.), Exploring Multimodal Composition and Digital Writing (pp. 20-35). Hershey, PA: IGI Global.

Zammit, K. (2015). Extending Students’ Semiotic Understandings: Learning About and Creating Multimodal Texts. In P. P. Trifonas (Ed.), International Handbook of Semiotics (pp. 1291-1308). New York, London: Springer.