Abstract
Conversations vary significantly with the styles their participants adopt, so a dialogue model trained exclusively on single-style datasets often performs poorly. We present a cost-effective methodology for generating datasets that feature multiple conversational styles and can be used to develop dialogue systems. The methodology assumes only that a knowledge base is available for the conversational domain, and leverages the generative capabilities of large language models to produce dialogues in a particular style. In a pilot study focused on the generation component of task-oriented dialogues, we extended the well-known MultiWOZ dataset with multiple style variations, producing a new multi-style dataset that covers diverse styles while retaining core dialogue properties. Our experiments yield two key findings: (i) the new multi-style resources pose challenges for current single-style models, and (ii) multi-style resources make the dialogue model more resilient to stylistic variations.