Wave-To-Wave

PUBLISHED ON MAY 7, 2020 — CATEGORIES: explorations

Approximation: a powerful creative idea

Looking for ways to automate creativity, I ended up considering the approximation problem, since it has been studied extensively and carries some powerful semantics. Could creativity emerge from approximating things (i.e. optimizing objectives)? Let’s take a look at this 1591 portrait by Giuseppe Arcimboldo:

Of course, there are multiple layers of creativity here that don’t relate to approximating a body with vegetables. Arguably, the stroke of genius is to mash up two prominent genres of the time, bodegón and portrait, into a single one. Still, there is something to it that could be both automated and considered creative in a way.

Peter Ablinger, Winfried Ritsch and Thomas Musil at KU Graz may have had a similar thought when they made the Talking Piano, the automated, sound-based analogue of Arcimboldo’s painting.

KU Graz has been an absolute powerhouse in music and technology for many years, and I had been following their work for a long time. The 2014 Ferienkurse in Darmstadt had a prominent section dedicated to them, and attendees could apply for masterclasses with their staff (I was very happy to be one of them, but that is a story for another day!). Among many other things, they brought the Talking Piano. Here is a picture that I took on that occasion:


My approach

Some time later, when I got into computer science, I finally knew enough to write my own overlap-save convolver, and I implemented a matching pursuit algorithm on top of it. I was now able to emulate the Talking Piano, but with arbitrary sounds. Here is a reconstruction of the song Sweep up the memories, from Dr. Seuss’ The Cat in the Hat, that I made using a bank of marimba sounds and this algorithm:

And here is the original:

Note that the audio results arise simply from sampling several thousand random marimba sounds and fitting them to the audio at the positions of maximal correlation. I find it interesting how this simple rule turns the onomatopoeic effect of the word “sweep” into an equivalent upward “sweep” (glissando) of marimba notes. It is also interesting that, in order to emulate the long string chords, the algorithm resorts to repeating shorter marimba notes (tremoli), which is what a human player would do as well.
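To make the rule concrete, here is a minimal Python/NumPy sketch of greedy matching pursuit by maximal correlation. It is not the code used for the reconstruction above: the function names, the unit-energy normalization of the atoms, the plain FFT cross-correlation (instead of an overlap-save convolver) and the synthetic decaying sinusoids that stand in for marimba samples are all assumptions made for illustration.

```python
import numpy as np


def cross_correlate(signal, atom):
    """Cross-correlation of `signal` with `atom` at every valid lag, via FFT."""
    n = len(signal) + len(atom) - 1
    nfft = 1 << (n - 1).bit_length()  # next power of two, avoids wrap-around
    spectrum = np.fft.rfft(signal, nfft) * np.conj(np.fft.rfft(atom, nfft))
    corr = np.fft.irfft(spectrum, nfft)
    return corr[: len(signal) - len(atom) + 1]


def matching_pursuit(target, bank, n_iter):
    """Greedily approximate `target` with shifted, scaled atoms from `bank`.

    Returns (sequence, residual), where sequence is a list of
    (atom_index, position, gain) and gain refers to the unit-energy atom.
    Every atom is assumed to be no longer than `target`.
    """
    residual = np.asarray(target, dtype=float).copy()
    atoms = [np.asarray(a, dtype=float) for a in bank]
    atoms = [a / np.linalg.norm(a) for a in atoms]  # unit energy
    sequence = []
    for _ in range(n_iter):
        best = None
        # Naive search: recompute every correlation at every iteration.
        for idx, atom in enumerate(atoms):
            corr = cross_correlate(residual, atom)
            pos = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[pos]) > abs(best[2]):
                best = (idx, pos, float(corr[pos]))
        idx, pos, gain = best
        # Subtract the best-matching atom from what is left to explain.
        residual[pos : pos + len(atoms[idx])] -= gain * atoms[idx]
        sequence.append(best)
    return sequence, residual


# Demo on synthetic data (hypothetical stand-ins for marimba notes).
rng = np.random.default_rng(0)
t = np.arange(512) / 44100.0
bank = [np.sin(2 * np.pi * f * t) * np.exp(-40.0 * t)
        for f in rng.uniform(200.0, 2000.0, 20)]
target = rng.standard_normal(8192)
sequence, residual = matching_pursuit(target, bank, n_iter=10)
```

Each iteration removes the projection of the residual onto one unit-energy atom, so the residual energy cannot increase. A practical version would update only the correlations affected by the last subtraction rather than recomputing all of them.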

On top of this simple but powerful idea of approximating sounds with other sounds, another, possibly more powerful one emerges:

The product of the approximation is interesting itself, but the resulting sequence could bear even more potential for creativity

In other words, the information about how to reconstruct the sound (which sound goes where, and how loud) can be reused, for example to vary the playback speed or to swap the sounds themselves, and the play on identities can be greatly extended. Click here to see an example of such a sequence, which produces the following audio:
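As a rough illustration of how such a sequence can be reused, the Python sketch below re-renders a list of events with a different time scale or a different sound bank. The storage format, (atom_index, position, gain) triples, and the names `render` and `time_scale` are assumptions for this example, not the format of the linked sequence.

```python
import numpy as np


def render(sequence, bank, length, time_scale=1.0):
    """Overlap-add the events of `sequence` into a new buffer.

    sequence:   iterable of (atom_index, position, gain), positions < length
    bank:       list of 1-D arrays; gains are taken relative to these arrays
    time_scale: > 1 spreads the events out, < 1 packs them closer together
                (the sounds themselves are not stretched)
    """
    out_len = int(np.ceil(length * time_scale)) + max(len(a) for a in bank)
    out = np.zeros(out_len)
    for idx, pos, gain in sequence:
        start = int(round(pos * time_scale))
        atom = np.asarray(bank[idx], dtype=float)
        out[start : start + len(atom)] += gain * atom
    return out


# Hypothetical toy data: two events and two interchangeable banks.
events = [(0, 10, 1.0), (1, 20, 0.5)]
bank_a = [np.ones(4), 2.0 * np.ones(4)]
bank_b = [-np.ones(4), np.full(4, 3.0)]

same_speed = render(events, bank_a, length=32)
half_speed = render(events, bank_a, length=32, time_scale=2.0)
other_bank = render(events, bank_b, length=32)
```

Changing `time_scale` moves the events without touching the sounds, while passing a different bank keeps the timing and replaces the sounds. Both are only possible because the sequence is kept separate from the rendered audio.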


The ALMAT installation and further work

All of this can also be done in real time (even recording the reconstruction sounds). At the 2017 edition of the Impuls Akademie, hosted by KU Graz, I had the chance to put some of this into practice at the ALMAT workshop led by Agostino DiScipio, Hans H. Rutz and David Pirrò. The basic idea, captured in this hacky SuperCollider script, was to detect salient sound events in real time, record them, and reconstruct them with past sound events.
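The detection step can be sketched offline in a few lines of Python. This is not the SuperCollider script mentioned above, only an illustration under simple assumptions: a sound event counts as "salient" when the energy of a frame jumps above a multiple of a slowly adapting background level. The names `detect_events`, `frame`, `ratio` and `smoothing` and their default values are made up for this example.

```python
import numpy as np


def detect_events(signal, frame=512, ratio=4.0, smoothing=0.9):
    """Return the start samples of frames where a salient event begins.

    A frame is "loud" when its mean energy exceeds `ratio` times the
    background estimate. The background only adapts during quiet frames.
    """
    signal = np.asarray(signal, dtype=float)
    background = None
    active = False
    onsets = []
    for start in range(0, len(signal) - frame + 1, frame):
        energy = float(np.mean(signal[start : start + frame] ** 2))
        if background is None:
            background = energy  # bootstrap from the first frame
        loud = energy > ratio * background + 1e-12
        if loud and not active:
            onsets.append(start)  # report only the rising edge
        active = loud
        if not loud:
            background = smoothing * background + (1.0 - smoothing) * energy
    return onsets


def slice_events(signal, onsets, length):
    """Cut a fixed-length excerpt at each onset, e.g. to grow a sound bank."""
    signal = np.asarray(signal, dtype=float)
    return [signal[s : s + length].copy() for s in onsets]
```

In a live setting the same logic would run frame by frame on the incoming audio, and each recorded excerpt would join the bank used to reconstruct later events.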

This was consistent with the main idea of the workshop, which centered on ecosystemic sonification: several systems were put together in a given space, each with its own audio inputs and outputs. Then, during the exhibition, the systems would interact with the visitors and with each other. Here are some nice pictures of our rehearsal at the Cube and of the setup before an exhibition:

Last but not least, this idea of approximation could be applied at more abstract levels of sound (e.g. musical structure instead of raw waveforms), opening up a lot of possibilities. This connects with the field of sparse coding and is one of the reasons I got really interested in deep learning (OK, the neural network hype also helped). Also in 2017, a (now very famous) paper came out introducing CycleGANs, a setup in which neural networks exploit this kind of multi-level hierarchical information to perform style transfer:

This boosted the whole discussion around automated creativity in a way that I think is very enriching and worth pursuing further, so stay tuned!


Original media in this post is licensed under CC BY-NC-ND 4.0. Software licenses are provided separately.