auricular.ai

The following article has been written by a human:
Whilst I love the multitude of tools AI offers, I want to be clear about what content has and has not been created by an AI model. I am by no means sitting on the fence between human-created content and AI-created content. The two fields complement each other and only spur me to improve my skills, perspective, and insight in both. I feel I can sit confidently and fully in both places. My intention is not to cause division but to express and document my experience of this remarkable scientific and creative revolution.

Entry 001: It's All About Stability

October 2022 was a deep dive into artificial intelligence, and the catalyst was Stability AI. Their talk [1] that month got me straight onto Hugging Face [2], one of the software platforms for machine learning and computation, and straight into using Stability AI’s Stable Diffusion [3], a text-to-image model.

I was instantly creating the kind of images I had only seen in the media and on YouTube, and ideas started flowing. I found the bulk image processing function and realized it would be possible to quickly create animations and videos. I used OpenAI’s chatbot model [4] to create an interview script, recorded my voice as the two actors, and used Stable Diffusion to create the frames for the video. The result was quite fascinating [5], to say the least.

Listening to Emad Mostaque, the founder of Stability AI, I was immediately inspired and felt reassured that my desire to engage with these early days of AI emergence was a smart investment of my time and a good use of my slowly cultivated skills.

“What I care about is AI that goes global to change our systems for the better. Our mission at Stability is to build a foundation to activate humanity’s potential.”
- Emad Mostaque - Founder and CEO of Stability AI

It quickly became clear that Stability AI was different from OpenAI, which up until then had been my only hands-on experience of what AI was and could do. My impression was that Stability AI genuinely cared about individuals, about skills, about talent, and about the positive impact of AI.

“This is the infrastructure of the future. Does it make sense that the most powerful technology in the world is controlled by a few with no accountability? No, it should be owned and controlled by everyone.”

The driving force behind Auricular is my passion for creating amazing audio datasets, so my ears pricked up when Emad also spoke about the importance of data and of data quality. I finished watching the presentation with the realization that we are at the beginning of a revolution.

My introduction to Stability AI really got me thinking about the meaning of stability and how it relates to AI. I wondered about what I aspire to achieve in the field. This search for meaning led me to mathematics, biology, and even science fiction. I started to see similarities across disciplines: stability really relates to the small changes that steer something away from its path, and to its ability to come back to what it is.

In my discovery of mathematics, I found that stability theory refers to results and systems remaining stable while enduring small changes to their state [6]. The complexity of what I read was like watching a master of martial arts performing sequences of moves at lightning speed. I can understand that there is movement, there is precision, and there are rules, but explaining or executing any of it myself lies outside of my knowledge and experience. What I did find was that I may be able to derive some insight from this and apply it to the way I capture metadata to accurately label the intricacies of audio for datasets.
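The idea that a stable system absorbs a small change to its state can be sketched in a few lines of Python. This is only an illustration under assumptions that are not from the talk or the referenced article: the toy equation dx/dt = -rate * x, the function names `simulate` and `max_gap`, and every parameter value are hypothetical choices made for this example.

```python
def simulate(x0, rate=0.5, dt=0.01, steps=1000):
    """Euler integration of the toy system dx/dt = -rate * x from x0."""
    x = x0
    path = [x]
    for _ in range(steps):
        x = x + dt * (-rate * x)
        path.append(x)
    return path


def max_gap(a, b):
    """Largest distance between two trajectories at any time step."""
    return max(abs(p - q) for p, q in zip(a, b))


# Stable case (rate > 0): nudge the starting state by a small amount.
baseline = simulate(1.0)
nudged = simulate(1.0 + 0.01)
stable_gap = max_gap(baseline, nudged)       # never exceeds the initial nudge
final_gap = abs(baseline[-1] - nudged[-1])   # shrinks as time goes on

# Unstable case (rate < 0): the same small nudge keeps growing.
unstable_gap = max_gap(simulate(1.0, rate=-0.5), simulate(1.01, rate=-0.5))

print(f"stable: max gap {stable_gap:.4f}, final gap {final_gap:.6f}")
print(f"unstable: max gap {unstable_gap:.4f}")
```

In the stable case the two paths never drift further apart than the original nudge and end up almost identical, while in the unstable case the same nudge is amplified many times over.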

Speaking of martial arts masters, Lex Fridman is a source of knowledge and inspiration for me. I came across his podcasts [7], which range widely in topic from AI, to chess, to the existence of aliens, to politics, science, and even martial arts. Lex himself, an accomplished practitioner of Brazilian Jiu-Jitsu, led me to stumble upon the concept of joint stability. I think having an arm wrestle with a terminator, and not as part of a Hollywood film, would illustrate how instability arises from the breakdown of key subsystems. Of course, the snapping sounds and gory images will entertain viewers, but it may go unnoticed that those subsystems, such as the tissue and tendons that support the bone, are what fail first. Identifying those subsystems from the outset of dataset creation could help ensure the stability, and ultimately the accuracy, of the dataset in its entirety.

Lex also spoke with Andrej Karpathy [8], a deep learning and computer vision specialist who was previously the director of AI at Tesla. Their conversation covered aspects of AI including neural networks, transformers, autonomous driving, and artificial general intelligence. What struck me was Andrej talking about how written text, as a form of data, does not explicitly contain all the information needed to accurately train a model.

“Text by itself I’m a little bit suspicious about. There’s a ton of stuff we don’t put in text in writing because they’re obvious to us, about how the world works and the physics of it. Text is a communication medium between humans and it's not an all-encompassing medium of knowledge about the world but as you pointed out we do have video and images and we have audio. And so that definitely helps a lot but we haven’t trained models sufficiently across all those modalities yet. So I think that’s what a lot of people are interested in.”
- Andrej Karpathy

Another standout feature of stability I discovered is resilience in ecology. Commonly used when referring to ecological stability, it is the capability of an ecosystem to return to equilibrium after being pushed away from its normal state [9]. I couldn’t help but relate this to the human species and contemplate how capable we are of remaining stable while being influenced by intelligent machines.

On a final note, I found a short story titled Stability, by Philip K. Dick [10]. The main character in the story discovers a time machine, and his journey through time evidently creates a bootstrap paradox, with each part of the journey appearing to be predestined by the action that came before it. We finally find him awakening in a city where people know only to live for “their machines”, unaware of life ever being any different. This controlled state of “Stability” has been set to prevent the decline of the civilization. This is a simple reminder that while artificial intelligence may one day have humans unknowingly serving its needs, it could very well be looking out for our best interests!

Written by Ezra Szandala - 31 October 2022



Footnotes
1. Stability AI talk - https://www.youtube.com/watch?v=1Uy_8YPWrXo
2. Hugging Face - https://huggingface.co
3. Stable Diffusion - https://huggingface.co/spaces/stabilityai/stable-diffusion
4. OpenAI Playground - https://beta.openai.com/playground
5. Interview with an Alien Part One - https://www.youtube.com/watch?v=0PPoIl62g-I
6. Stability Theory - https://en.wikipedia.org/wiki/Stability_theory
7. Lex Fridman YouTube - https://www.youtube.com/channel/UCSHZKyawb77ixDdsGog4iWA
8. Andrej Karpathy: Tesla AI, Self-Driving, Optimus, Aliens, and AGI | Lex Fridman Podcast #333 (30 October 2022) - https://www.youtube.com/watch?v=cdiD-9MMpb0&t=693s
9. Ecological Stability - https://en.wikipedia.org/wiki/Ecological_stability
10. Stability Short Story - https://en.wikipedia.org/wiki/Stability_(short_story)