Proactive AI From JD.com Watches Your Camera and Speaks Without Prompting
Summary
JD.com has released JoyAI-VL-Interaction, an open-source vision-language model that watches live video and decides on its own when to speak. The model targets proactive interaction scenarios such as fall detection, security monitoring, live commerce, and industrial supervision. JD.com says the 8B-parameter model beats competing systems on timing in event-driven tests while remaining deployable on standard hardware. It is licensed under Apache 2.0, and the release includes the weights, training recipe, training data, and system code, giving developers a reproducible baseline for proactive multimodal AI.