Image credit: ©️ Hakuya Labs 2024. All Rights Reserved
Let’s take a look back at a past SIGGRAPH contribution! At SIGGRAPH 2024’s Real-Time Live!, the audience was treated to an exciting look at “Warudo”, an awe-inspiring live performance sandbox redefining VTubing and interactive storytelling. To dive deeper into the technology and vision behind “Warudo”, SIGGRAPH caught up with one of its key contributors, Tiger Tang, to explore how the platform empowers creators, advances immersive storytelling, and is paving the way for the next evolution in live, audience-driven digital performance.
SIGGRAPH: How does “Warudo” enhance avatar expressiveness and audience interaction compared to traditional VTubing systems?
Tiger Tang (TT): Traditional VTubing systems often establish a strict one-to-one relationship between the performer’s real-life movements and the avatar’s on-screen behavior. But our key insight is that VTubers don’t necessarily want to replicate themselves — they want to embody a fictional character. VTubing is about becoming the persona you imagine, not just animating your physical self.
“Warudo” enhances expressiveness by layering procedural animations on top of motion capture input. From simple idle poses to stylized secondary motion driven by pendulum physics, it all blends seamlessly together, making the avatar feel more alive.
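To make the layering idea concrete, here is a minimal Python sketch of a damped pendulum adding secondary sway on top of a mocap-driven bone. The class names, parameters, and blend logic are illustrative assumptions for this article, not Warudo's actual implementation:

```python
class PendulumBone:
    """Damped pendulum that adds secondary sway to a mocap-driven bone.
    (Illustrative sketch; not Warudo's actual physics code.)"""
    def __init__(self, stiffness=40.0, damping=6.0):
        self.angle = 0.0        # current sway angle (radians)
        self.velocity = 0.0
        self.stiffness = stiffness
        self.damping = damping

    def step(self, parent_accel, dt):
        # The parent bone's acceleration (from mocap) excites the pendulum;
        # a spring pulls it back to rest and damping bleeds off the energy.
        accel = (-self.stiffness * self.angle
                 - self.damping * self.velocity
                 - parent_accel)
        self.velocity += accel * dt
        self.angle += self.velocity * dt
        return self.angle

def blend_rotation(mocap_angle, idle_angle, sway_angle, mocap_confidence):
    """Layer the animations: fall back toward an idle pose when tracking
    confidence is low, then add the physics-driven sway on top."""
    base = mocap_confidence * mocap_angle + (1.0 - mocap_confidence) * idle_angle
    return base + sway_angle
```

The key point of the sketch is that the physics layer only *adds* to whatever the tracking layer produces, so the avatar stays responsive to the performer while still moving on its own.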
For audience interaction, traditional VTubing systems often rely on a small set of preset effects — for example, spawning a 3D gift that drops on the avatar when a donation is received. The problem is, VTubing is all about being unique: If everyone is using the same effects, they will become tiring for the audience very quickly.
“Warudo” solves this by introducing a visual programming interface that allows users to not only customize built-in preset effects with more flexibility than ever, but also create completely unique effects by mixing and matching event nodes (e.g., when a gift is received, when a message containing a keyword is received…) and effect nodes (e.g., throw a prop, play a sound effect, spawn particles…). We took a lot of inspiration from Unreal Engine’s blueprint system but made it much more accessible to nontechnical users.
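A toy Python sketch of the “event node → effect node” pattern may help. This is a hypothetical reduction of the idea, not Warudo's blueprint API; the node names and payloads are made up:

```python
class EffectNode:
    """Wraps a single action, e.g., throw a prop or play a sound."""
    def __init__(self, action):
        self.action = action

    def run(self, payload):
        self.action(payload)

class EventNode:
    """Fires every connected effect node when its event occurs."""
    def __init__(self):
        self.outputs = []

    def connect(self, effect):
        self.outputs.append(effect)

    def fire(self, payload):
        for effect in self.outputs:
            effect.run(payload)

# "When a message containing a keyword is received -> throw a prop + play a sound"
on_keyword = EventNode()
on_keyword.connect(EffectNode(lambda msg: print(f"Throwing prop! ({msg})")))
on_keyword.connect(EffectNode(lambda msg: print("Playing bonk sound effect")))

chat_message = "chat says: bonk"
if "bonk" in chat_message:
    on_keyword.fire("bonk")
```

In a visual tool, the `connect` calls become wires the user drags between nodes, which is what makes the same model approachable for nontechnical creators.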
SIGGRAPH: Can you explain how “Warudo” integrates multiple tracking systems, such as iPhones, webcams, and other at-home tracking devices, into one cohesive platform?
TT: This one is easier than you may think! 3D VTubing avatars, fortunately, have been pretty much exclusively made in the VRM format since 2021. The VRM format was originally developed specifically for VR humanoid characters, but it has slowly become the de facto standard format for 3D VTubing models. Under the hood, “Warudo” treats every tracker as a modular data provider that is retargeted to the common VRM humanoid rig; the motion inputs are then mixed and/or blended with procedural animations.
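Here is a minimal sketch of the “every tracker is a modular data provider” idea. The interface, bone list, and device classes below are assumptions for illustration, not Warudo's plugin API:

```python
from abc import ABC, abstractmethod

# A handful of VRM humanoid bones, for illustration only.
VRM_BONES = ["hips", "spine", "head", "leftUpperArm", "rightUpperArm"]

class TrackerProvider(ABC):
    """Every tracking device implements the same interface: it produces
    rotations for whatever subset of the VRM humanoid rig it can see."""
    @abstractmethod
    def poll(self) -> dict:   # bone name -> rotation (Euler angles here)
        ...

class IPhoneFaceTracker(TrackerProvider):
    def poll(self):
        return {"head": (5.0, 0.0, 0.0)}          # head pose from the phone

class WebcamBodyTracker(TrackerProvider):
    def poll(self):
        return {"spine": (0.0, 2.0, 0.0), "leftUpperArm": (30.0, 0.0, 0.0)}

def mix(providers):
    """Merge all providers onto the shared VRM humanoid rig; later
    providers win on conflicts, and any bone left untracked stays
    free for procedural animation to fill in."""
    pose = {}
    for provider in providers:
        for bone, rotation in provider.poll().items():
            if bone in VRM_BONES:     # retarget onto the common rig
                pose[bone] = rotation
    return pose

print(mix([WebcamBodyTracker(), IPhoneFaceTracker()]))
```

Because every device funnels into the same rig, adding a new tracker is a matter of writing one more provider rather than reworking the animation pipeline.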
SIGGRAPH: What role might “Warudo” play in advancing immersive storytelling and audience-driven interactive content?
TT: We think of “Warudo” as a live performance sandbox — think of us like Roblox, but for VTubing. Our powerful visual programming system and plugin API allow developers to create and iterate on interactive storytelling experiences in real time.
Here are two of the most impressive examples: one turned “Warudo” into a horror boss fight where the audience must collaboratively “defeat” the VTuber, who has a health bar, and another turned “Warudo” into a third-person, turn-based JRPG. We often look at community content like this and scratch our heads: “How did they implement that?”

Image credit: ©️ Hakuya Labs 2024. All Rights Reserved
SIGGRAPH: How does the intuitive, node-based visual programming system empower creators, especially those with minimal technical experience, to customize and enhance their performances?
TT: Our visual programming system, or “blueprints,” boils down to “When X happens → do Y.” Need your avatar to sneeze when chat redeems the “achoo” channel reward? Drag an On Twitch Channel Points Redeemed node, connect it to a Character Play Animation node, and you’re done.
This system is beginner-friendly but doesn’t stop at basic use cases. In fact, all of “Warudo’s” mocap processing and animation logic is built with the same blueprint system that general users have access to. Advanced creators may create blueprints with a few hundred nodes that represent complex logic, while nontechnical users can follow drag-and-drop tutorials to create complex effects without writing a single line of code.
By providing a shared creative toolbox that works for all skill levels, “Warudo” encourages community knowledge sharing and lowers the barrier between technical and nontechnical creators. It’s one of the reasons we see such a wide range of content — from spontaneous meme effects to fully scripted shows — being built by our community every day.
SIGGRAPH: Aside from VTubing, what other creative or professional fields do you imagine benefiting from “Warudo’s” platform?
TT: While our current focus is on serving the VTubing community, we believe the many lessons learned from “Warudo” (e.g., expressive avatars via procedural animations, avatar behavior customization via visual programming) can be readily applied to the wider virtual production industry. We also see the potential for “Warudo” to be used in museum and art installations as interactive exhibits where visitors control avatars or objects, or in therapeutic contexts where expressive avatars help build comfort and communication, especially with children or neurodivergent users.
Want more? Watch the full SIGGRAPH 2024 and 2025 Real-Time Live! shows now on YouTube and keep the innovation going.