As artificial intelligence assistants take on a growing role in customer service, shopping, and daily communication, researchers have started to look more closely at how humans speak to them. A new study by Amazon scientists reveals that our tone, grammar, and level of politeness can directly influence how well chatbots understand us, and, over time, how they learn to respond.
The study[1], titled "Mind the Gap: Linguistic Divergence and Adaptation Strategies in Human-LLM Assistant vs. Human-Human Interactions," shows that people adjust their language depending on whether they are chatting with a human or a machine. That small behavioral shift, the authors found, can make a measurable difference in how effectively AI models interpret user intent.
Politeness Gap Between Humans and Machines
The research team, led by Fulei Zhang and Zhou Yu at Amazon, analyzed thousands of real customer support conversations involving both human agents and AI chatbots. They examined six linguistic dimensions: grammar fluency, politeness, lexical diversity, informativeness, clarity, and emotional intensity, using Claude 3.5 Sonnet to score each message on a five-point scale.
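The paper's exact scoring prompt and rubric are not reproduced in this post, but the general LLM-as-judge setup can be sketched roughly as follows. The prompt wording, model snapshot name, and JSON output format below are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of LLM-as-judge scoring, assuming the Anthropic Messages API.
# The prompt text, dimension list, and parsing are illustrative; the paper's
# exact rubric is not public.
import json
import anthropic

DIMENSIONS = [
    "grammar fluency", "politeness", "lexical diversity",
    "informativeness", "clarity", "emotional intensity",
]

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def score_message(message: str) -> dict:
    prompt = (
        "Rate the following customer message on each dimension from 1 (low) to 5 (high). "
        f"Dimensions: {', '.join(DIMENSIONS)}. "
        "Reply with only a JSON object mapping each dimension to an integer.\n\n"
        f"Message: {message}"
    )
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # assumed model snapshot
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}],
    )
    # Optimistic parsing: assumes the model returned bare JSON as instructed.
    return json.loads(response.content[0].text)


# Example: score_message("wifi broken since yesterday fix it")
```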
People were found to be 14.5% more polite and formal when writing to human agents, and 5.3% more grammatically fluent. Lexical diversity was also slightly higher, by 1.4%. Informativeness, clarity, and emotional tone, however, showed little difference. In short, users still conveyed their needs, but they tended to simplify their grammar and drop polite markers when addressing AI systems.
The researchers note that this behavior likely reflects how users perceive chatbots: as functional tools rather than social partners. The shift produces shorter and more direct sentences, which in turn creates a “domain gap” between training data and real-world input. Models trained on polished human-to-human conversations may struggle to interpret blunt or fragmented chatbot-directed messages, even if the intent remains the same.
Testing How Training Data Shapes Understanding
To measure how this stylistic gap affects AI performance, the team built several versions of a conversational model using Mistral 7B as the base. They first trained it on 13,000 human-to-human chat messages, then tested it with 1,357 messages written for chatbots. The model had to identify user intent from a list of service categories.
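The article does not spell out the training recipe, but a plausible minimal setup treats intent detection as sequence classification on top of Mistral 7B. The label set, example rows, and hyperparameters below are assumptions made for illustration, not details from the paper.

```python
# Minimal sketch of intent classification with Mistral 7B as the backbone,
# using Hugging Face transformers. Labels, data, and hyperparameters are assumed.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

INTENTS = ["billing", "returns", "technical_support", "account", "other"]  # hypothetical label set

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
tokenizer.pad_token = tokenizer.eos_token  # Mistral ships without a pad token

model = AutoModelForSequenceClassification.from_pretrained(
    "mistralai/Mistral-7B-v0.1", num_labels=len(INTENTS)
)
model.config.pad_token_id = tokenizer.pad_token_id

# Two placeholder rows standing in for the ~13,000 human-to-human training messages.
train_rows = [
    {"text": "Hello, I was hoping you could help me return a damaged item.",
     "label": INTENTS.index("returns")},
    {"text": "Could you please explain the extra charge on my last bill?",
     "label": INTENTS.index("billing")},
]


def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)


train_ds = Dataset.from_list(train_rows).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="intent-mistral",
                           per_device_train_batch_size=4,
                           num_train_epochs=3,
                           learning_rate=2e-5),
    train_dataset=train_ds,
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```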
They designed four training datasets to test the effect of linguistic diversity. The baseline model used only the original human-to-human conversations. Another version was trained on “minimal style” rewrites: short, ungrammatical, and informal messages. A third used “enriched style” rewrites with high grammar and politeness scores. The fourth combined all three to maximize stylistic range.
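How the rewrites were generated is not shown in the article; one way to picture it is to ask an LLM to restyle each original message and then pool the variants. The prompt wording and the `rewrite` helper below are assumptions, standing in for whatever rewriting model the authors used.

```python
# Sketch of building the four training variants, assuming an LLM-backed
# rewrite(prompt) -> str helper (e.g., wrapping the client shown earlier).
MINIMAL_PROMPT = ("Rewrite this customer message as a terse, informal, "
                  "ungrammatical chat message. Keep the same intent:\n\n{msg}")
ENRICHED_PROMPT = ("Rewrite this customer message with polished grammar and "
                   "polite, formal phrasing. Keep the same intent:\n\n{msg}")


def build_variants(originals, rewrite):
    """originals: list of {"text": ..., "label": ...}; rewrite: callable(str) -> str."""
    minimal = [{"text": rewrite(MINIMAL_PROMPT.format(msg=r["text"])), "label": r["label"]}
               for r in originals]
    enriched = [{"text": rewrite(ENRICHED_PROMPT.format(msg=r["text"])), "label": r["label"]}
                for r in originals]
    return {
        "baseline": originals,                    # original human-to-human messages only
        "minimal": minimal,                       # terse, informal rewrites
        "enriched": enriched,                     # formal, polished rewrites
        "mixed": originals + minimal + enriched,  # full stylistic range
    }
```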
Models trained on narrow styles performed worse. Accuracy dropped 2.6% for the minimal version and 1.8% for the enriched version compared with the baseline. But the model exposed to the full mix of communication styles improved intent detection by 2.9%. This result, though modest, confirmed that broader linguistic exposure helps AI adapt to the messier language patterns found in daily user interactions.
Why Rewriting at the Last Minute Doesn’t Work
The researchers also explored a second strategy: rewriting user messages automatically before the chatbot responds. The idea was to normalize the input into a cleaner, more human-like style. Yet this quick fix had the opposite effect. When chatbot inputs were rewritten at inference time, accuracy fell by 1.9%, suggesting that such transformations risk losing subtle clues hidden in the user’s original phrasing.
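The inference-time strategy amounts to one extra normalization step in front of the classifier. In the sketch below, `rewrite` and `classify_intent` are hypothetical placeholders for the rewriting model and the fine-tuned intent model.

```python
# Sketch of the inference-time rewriting strategy the study found counterproductive:
# optionally normalize the raw chatbot-directed message before classification.
NORMALIZE_PROMPT = ("Rewrite this message in the polite, fluent style of a "
                    "customer writing to a human agent. Keep the same intent:\n\n{msg}")


def predict_intent(raw_message, rewrite, classify_intent, normalize=False):
    # With normalize=True, the classifier sees a cleaned-up paraphrase instead of
    # the user's original wording; the study reports this costs about 1.9% accuracy.
    text = rewrite(NORMALIZE_PROMPT.format(msg=raw_message)) if normalize else raw_message
    return classify_intent(text)
```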
These findings reinforce the need to address language variability during training rather than rely on post-processing tricks. Models that encounter a wide range of communication styles early on learn to recognize meaning even in imperfect, terse, or emotionally charged sentences.
Toward More Natural Conversations
The study highlights a subtle but important truth about AI communication: it is not only about what we say, but how we say it. People instinctively simplify their language when speaking to a system they view as mechanical. Over time, that difference in style feeds back into the system’s own understanding of language.
By building chatbots that learn from both formal and casual expressions, developers can make AI assistants more resilient in real-world conversations. The work also raises questions about long-term adaptation. As AI models grow more capable of understanding nuance, people might begin to treat them more like human listeners, reducing the current politeness gap.
For now, the message is clear. The way users phrase their requests can either narrow or widen the communication divide. Training chatbots to recognize that diversity could help bridge the gap between human language and machine understanding, one polite sentence at a time.
Notes: This post was edited/created using GenAI tools. Image: DIW-Aigen.
Read next: Rude Prompts Give ChatGPT Sharper Answers, Penn State Study Finds[2]
References
- ^ The study (arxiv.org)
- ^ Rude Prompts Give ChatGPT Sharper Answers, Penn State Study Finds (www.digitalinformationworld.com)
