Summary
Sending a chat message that includes an image attachment (e.g. a screenshot) crashes the chat task with TypeError: expected string or bytes-like object, got 'list'. Text-only messages work fine. The failure happens during $skill-name mention parsing, which assumes the user message content is always a string.
Environment
- cptr 0.6.1
- Python 3.11
- Skills enabled in the workspace (required to hit the code path)
Steps to reproduce
- Have at least one skill available in the workspace.
- Open a chat and attach an image (screenshot) to a message.
- Send the message.
- The task errors out immediately; the assistant never responds.
Expected
The message (text + image) is processed normally, the same as a text-only message.
Actual
The chat task raises:
File ".../cptr/utils/chat_task.py", line 1177, in run_chat_task
mentioned = _re.findall(
File ".../re/__init__.py", line 216, in findall
TypeError: expected string or bytes-like object, got 'list'
The offending content value at the time of failure:
[{'type': 'text', 'text': 'getting this error'},
{'type': 'image', 'media_type': 'image/png', 'base64': 'iVBORw0KGgo...'}]
Root cause
In run_chat_task (cptr/utils/chat_task.py), the $skill-name mention parser runs re.findall directly on last_user["content"]:
mentioned = _re.findall(
r"\$([a-z0-9](?:[a-z0-9-]*[a-z0-9])?)", last_user["content"]
)
For text-only messages content is a str, but for multimodal messages (any attachment) it's a list of content blocks, which re.findall cannot accept.
Suggested fix
Extract the text blocks before matching:
_content = last_user["content"]
if isinstance(_content, list):
_text = " ".join(
b.get("text", "")
for b in _content
if isinstance(b, dict) and b.get("type") == "text"
)
else:
_text = _content or ""
mentioned = _re.findall(r"\$([a-z0-9](?:[a-z0-9-]*[a-z0-9])?)", _text)
This preserves $skill mention detection (now scanning the text portions of multimodal messages) while fixing the crash. Worth auditing other spots in the message path that assume content is a str, in case the same assumption appears elsewhere.
Summary
Sending a chat message that includes an image attachment (e.g. a screenshot) crashes the chat task with
TypeError: expected string or bytes-like object, got 'list'. Text-only messages work fine. The failure happens during$skill-namemention parsing, which assumes the user message content is always a string.Environment
Steps to reproduce
Expected
The message (text + image) is processed normally, the same as a text-only message.
Actual
The chat task raises:
The offending
contentvalue at the time of failure:[{'type': 'text', 'text': 'getting this error'}, {'type': 'image', 'media_type': 'image/png', 'base64': 'iVBORw0KGgo...'}]Root cause
In
run_chat_task(cptr/utils/chat_task.py), the$skill-namemention parser runsre.findalldirectly onlast_user["content"]:For text-only messages
contentis astr, but for multimodal messages (any attachment) it's a list of content blocks, whichre.findallcannot accept.Suggested fix
Extract the text blocks before matching:
This preserves
$skillmention detection (now scanning the text portions of multimodal messages) while fixing the crash. Worth auditing other spots in the message path that assumecontentis astr, in case the same assumption appears elsewhere.