MLLMs Construction Company: Investigating Multimodal LLMs’ Communicative Skills in a Collaborative Building Task

M Sarzotti; Giovanni Duca; C Madge; Raffaella Bernardi; Massimo Poesio

Conference proceeding

Open access

Peer reviewed

CLiC-it 2025: Eleventh Italian Conference on Computational Linguistics, Vol.4112

Eleventh Italian Conference on Computational Linguistics (Cagliari, 24/09/2025–26/09/2025)

2025

Handle: https://hdl.handle.net/10863/51711

Abstract

How effective are the communication choices of Multimodal Large Language Models (MLLMs) when pursuing a common goal? Can they make use of common human dialogical patterns? We address these questions by engaging two agents based on the Mistral model in a collaborative building task, in which one agent has to instruct the other how to build a specific target structure. The aim of this work is to investigate whether different prompting techniques with varying degrees of multimodality can influence the performance of MLLM-based agents in the proposed task. Code and data are available in the project’s GitHub repository.

Keywords: Communication; dialogue; Multimodality; 3D understanding

Files and links (2)

pdf

2025.clicit-1.97 (1.37 MB)

Open Access

url

https://api.elsevier.com/content/abstract/scopus_id/105034261961

Details

Title: MLLMs Construction Company: Investigating Multimodal LLMs’ Communicative Skills in a Collaborative Building Task
Creators: M Sarzotti; Giovanni Duca; C Madge; Raffaella Bernardi; Massimo Poesio
Publication Details: CLiC-it 2025: Eleventh Italian Conference on Computational Linguistics, Vol.4112
Conference: Eleventh Italian Conference on Computational Linguistics (Cagliari, 24/09/2025–26/09/2025)
Series / Volume: 4112
Publisher: CEUR-WS
Number of pages: 11
Identifiers: (UNIBZ)96803058; 991007306844401241
Scopus ID: 2-s2.0-105034261961
Copyright: Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
Academic Unit: Faculty of Engineering
Language: English
Resource Type: Conference proceeding
Author Names String: Sarzotti M, Duca G, Madge C, Bernardi R, Poesio M
