Image Credit: Melody Slot Machine
Inspired by both Mozart and a children’s toy at IKEA, Masatoshi Hamanaka and his team developed “Melody Slot Machine,” a device able to generate unique melodies at the touch of a dial, and presented their creation at SIGGRAPH 2019. We caught up with Masatoshi to learn more about his inspiration, technical process, and future plans for the technology.
SIGGRAPH: What was the process for creating “Melody Slot Machine”? Are you or any of your team members musicians or artists yourselves?
Masatoshi Hamanaka (MH): All of the almost 30 people who contributed to this work have some musical or artistic ability. Three of the musicologists on our research team are particularly special: They make up the only group in the world that can analyze and compose music using a music theory called Generative Theory of Tonal Music (GTTM). The main purpose of this project was to maximize and utilize their abilities through GTTM and transcend Mozart.
Mozart composed a musical dice game in which short melodies are selected one at a time by rolling dice and then concatenated. Whichever melodies the dice select, the result is a well-balanced and beautiful piece. Only a genius like Mozart could create such a game.
SIGGRAPH: Let’s get technical. What was it like to develop the technology in terms of research and execution?
MH: Neither I nor the three musicologists are geniuses; however, I realized the possibility of using GTTM to create something interesting like Mozart’s dice game. GTTM extracts the deep structure of music. Using this deep structure, we built a melody-morphing method that can generate intermediate melodies from two original melodies. The original melody and the melodies made by the morphing method have similar deep structures. Therefore, even if you swap one part of a melody for the corresponding part of the original melody or of one of the morphed melodies, the overall deep structure will not change much and the melody will remain intact.
SIGGRAPH: How does the system switch the variations of melodies through the turning of the dial?
MH: We created a slot machine-like interface after being inspired by a toy at IKEA. The playground equipment in the children’s area of IKEA has three dials that rotate horizontally, and a different head, upper body, and legs are printed on each side. When you turn a dial, the face and legs change to those of a woman or man, and the body shape changes as well. However, regardless of the combination, the end result is still a person.
This mechanism fits perfectly in our system where the overall structure of the melody does not change, even when a part of the melody is switched. By making the dial like a slot machine and adding a lever, the overall experience is greatly enhanced.
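To make the dial idea concrete, here is a minimal sketch of how per-segment switching could be modeled. It is an illustration only, not the team's implementation: the class `MelodySlotModel`, its methods, the three-segment layout, and the note names are all hypothetical, and the variations are plain placeholder lists rather than melodies produced by GTTM-based morphing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MelodySlotModel:
    """Hypothetical model: one dial per melody segment."""

    # variations[s][v] holds the notes of variation v for segment s.
    variations: List[List[List[str]]]
    # Index of the variation currently selected on each dial.
    dial_positions: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start every dial on its first variation if none were given.
        if not self.dial_positions:
            self.dial_positions = [0] * len(self.variations)

    def turn_dial(self, segment: int, steps: int = 1) -> None:
        """Rotate one dial; positions wrap around like a slot reel."""
        count = len(self.variations[segment])
        self.dial_positions[segment] = (self.dial_positions[segment] + steps) % count

    def current_melody(self) -> List[str]:
        """Concatenate the selected variation of every segment in order."""
        melody: List[str] = []
        for segment, position in enumerate(self.dial_positions):
            melody.extend(self.variations[segment][position])
        return melody


# Assumed toy data: three segments with two variations each.
model = MelodySlotModel(variations=[
    [["C4", "E4"], ["C4", "G4"]],
    [["F4", "A4"], ["D4", "F4"]],
    [["G4", "C5"], ["B3", "C4"]],
])
model.turn_dial(1)  # switch only the middle segment
print(model.current_melody())  # ['C4', 'E4', 'D4', 'F4', 'G4', 'C5']
```

Turning one dial changes only its own segment, which mirrors the point in the interview: as long as every variation of a segment shares a similar deep structure, any combination still forms a coherent whole.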
We used deep learning for analyzing music and generating an interpolation video. Regarding the music analysis, there is a large gap between the score and the music structure of GTTM, so the music cannot be learned directly. However, in GTTM, human music knowledge is given in the form of rules, and learning is made possible by using these rules due to the simplicity of each one. This is a rare example of effectively utilizing human knowledge for network learning.
Regarding the interpolation video, the marimba song presented at SIGGRAPH 2019 has 11 variations of melodies, and we shot videos of all 11 patterns. When the slot dial is turned to switch the melody pattern, the movement at the point where the two videos join becomes discontinuous and unnatural. In other words, the marimba player’s standing position suddenly changes or a limb suddenly moves to another position. Therefore, the section from 10 frames before to 10 frames after the switching boundary is cut out, and a video synthesized by deep learning is inserted in this gap. This allows the performer to move smoothly.
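The cut-and-fill step described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the team's pipeline: frames are flat lists of pixel values, both clips are assumed to share one timeline so that the same frame index means the same musical moment, and the deep-learning synthesizer is replaced by a plain linear cross-fade. The names `crossfade` and `splice_with_gap` are hypothetical.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # toy stand-in for an image: a flat list of pixel values


def crossfade(before: Frame, after: Frame, count: int) -> List[Frame]:
    """Placeholder synthesizer: linear blend between two frames.

    This is NOT the deep-learning model mentioned in the interview; it only
    fills the gap so the splice logic can be exercised.
    """
    frames = []
    for i in range(1, count + 1):
        t = i / (count + 1)  # blend weight, strictly between 0 and 1
        frames.append([(1 - t) * a + t * b for a, b in zip(before, after)])
    return frames


def splice_with_gap(
    outgoing: Sequence[Frame],
    incoming: Sequence[Frame],
    boundary: int,
    margin: int = 10,
    synthesize: Callable[[Frame, Frame, int], List[Frame]] = crossfade,
) -> List[Frame]:
    """Switch from `outgoing` to `incoming` at frame index `boundary`.

    The `margin` frames on each side of the boundary are discarded and
    replaced by 2 * margin synthesized frames, so the timeline length is
    preserved.
    """
    if boundary - margin < 1 or boundary > len(outgoing):
        raise ValueError("not enough outgoing frames before the boundary")
    if boundary + margin >= len(incoming):
        raise ValueError("not enough incoming frames after the boundary")
    head = list(outgoing[: boundary - margin])   # kept part of the old clip
    tail = list(incoming[boundary + margin:])    # kept part of the new clip
    gap = synthesize(head[-1], tail[0], 2 * margin)
    return head + gap + tail


# Assumed toy clips: 40 one-pixel frames each, switching at frame 20.
old_clip = [[0.0] for _ in range(40)]
new_clip = [[1.0] for _ in range(40)]
result = splice_with_gap(old_clip, new_clip, boundary=20)
print(len(result))  # 40
```

Passing the synthesizer in as a function keeps the splice logic independent of how the in-between frames are produced, so a learned interpolation model could be swapped in without changing the cut points.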
SIGGRAPH: What kind of equipment was needed to implement the musical aspects and the hologram?
MH: The implementation was very difficult. At first, we prepared a large monster machine and began testing it, but this method didn’t work. When you operate the dial, the future melody of the song begins before playback changes. At the same time, the video stream begins before playback is switched to another video stream. Even when using the monster machine, switching video streams took more than 0.5 seconds, and video playback stops if it’s not quick enough. Perhaps no one has ever thought of developing a special process to switch the video streams instantly.
We were at a loss, but the iPad Pro was released soon after and happened to solve all our problems, perhaps because its CPU, GPU, and memory cooperate so well. Using this new technology, we were able to achieve the advanced processing we needed, including efficient switching of video streams.
SIGGRAPH: Attendees had a lot of fun with this at the conference. Technologically speaking, what do you find exciting about the final experience you presented to SIGGRAPH 2019 participants?
MH: I think the attendees enjoyed the unusual experience of controlling the holographic performers in front of them. Everyone was able to easily understand the simple operation of selecting the future score by turning dials.
Even those who have no musical experience can enjoy playing an instrument or creating a good song. Normally, doing this requires long-term training and a lot of experience, which can be frustrating. In this case, though, attendees were able to enjoy playing and making music without the years of time and hard work that would have to go into the standard learning process.
Moreover, with a holographic display, you can see not only the front view, but also the side views. The sound is mixed separately for the front and side, and attendees were able to feel this intense presence, as if the sound were coming straight from the holographic marimba.
SIGGRAPH: What’s next for “Melody Slot Machine”? What kind of application do you hope to see long term?
MH: We’re making new advances little by little, most of which are still a secret for now. What we can share is that Melody Slot Machine currently enables users to control the future performance of small palm-sized performers. Next time, we might make a larger application that enables users to direct many performers.
The application here is not limited to children, but there is a large emphasis on education. Our goal is to continue the effort by extending the enjoyment of manipulating music to not only music professionals, but also ordinary people who would not otherwise be able to do so.
SIGGRAPH: What’s your all-time favorite SIGGRAPH memory?
MH: I joined SIGGRAPH for the first time in 2009 in New Orleans, so this year was my second time in 10 years. In 2009, we exhibited Sound Scope Headphones at Emerging Technologies. The headphones were equipped with a digital compass and a distance sensor, and an interface emphasized and played a specific instrument from among an ensemble of eight types of instruments when the user’s face turned or their pose changed. Since a large space was prepared, all the instruments were lined up so that it was easy to see which instrument was being played, and the lighting intensity was varied in accordance with the volume. I remember many children enjoying the work. In fact, what we developed in 2009 is the melody morphing technique that is the basis of this year’s exhibition.
SIGGRAPH: What advice do you have for someone looking to present their work at a future SIGGRAPH conference? Will we be seeing you and your work at future SIGGRAPH conferences?
MH: Create what you’re passionate about and attendees will enjoy it. This enjoyment and the smiles I get to see from SIGGRAPH attendees are so rewarding, and I definitely want to come back with my next work. It took more than 15 years from the beginning of our GTTM research to the creation of Melody Slot Machine, and this has shown me how much the old saying, “continuity is the father of success,” rings true. I plan to implement this once again in my future work, allowing ample time to properly develop my next creation, and return to exhibit again at SIGGRAPH.
Have a new technology you’d like to showcase at SIGGRAPH? Stay tuned for information about how to submit to the SIGGRAPH 2020 Emerging Technologies program!
Masatoshi Hamanaka received his Ph.D. in engineering from the University of Tsukuba, Japan, in 2003. He is currently the leader of the Music Information Intelligence Team in the RIKEN Center for Advanced Intelligence Project. His research interests are music information technology, biomedical systems, and unmanned aircraft. He received the Journal of New Music Research Distinguished Paper Award in 2005.