deadmanoz

5 hours ago
Wow, on a cursory first look this looks pretty amazing…

(and multimodal models are apparently now called “omni-modal”)

… Qwen3-Omni

“capable of understanding text, audio, images, and video, as well as generating speech in real time.”

Will be adding this to the self-hosted stack I run at home to put it through its paces!

https://github.com/QwenLM/Qwen3-Omni