Week 15

Challenge 03

May 06 2022

Idea

Together with Jeremy, I wanted to build a generative storytelling system, including some procedural visuals to go along with it, running on the Raspberry Pi Zero W.

The signal chain and components visualized in an easy-to-understand way.

Conceptual Structure

Programming flowchart showing the architecture of what would eventually grow into separate Python objects.

Timetable

Our timetable for the one week, including deliverables. This disregarded the tact

Execution

In the end, the project consisted of two separate components: the image generator and the story generator.

Story Generator

The story generator consisted of a Python script that uses BeautifulSoup4 to crawl websites and extract data from their HTML. By going through all the <a> elements on a page, it collects all external links and filters out undesired ones (such as social media and Google links). After extracting the first <h2> element, it chooses a random external link and navigates to it. This way the script can autonomously navigate the web and jump from page to page, as long as the visited websites have at least one external link.
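The crawl step described above can be sketched roughly as follows. This is not our original script: to keep the example free of dependencies it swaps BeautifulSoup4 for the standard library's `html.parser` (with BeautifulSoup4, `soup.find_all("a", href=True)` and `soup.find("h2")` would play the same roles). The blocklist and all function and class names are hypothetical.

```python
import random
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical blocklist; the real filter list is not given in the text.
BLOCKED_DOMAINS = ("facebook.com", "twitter.com", "instagram.com", "google.com")


class PageScanner(HTMLParser):
    """Collects every <a href> and the text of the first <h2> on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.first_h2 = None
        self._in_h2 = False
        self._h2_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)
        elif tag == "h2" and self.first_h2 is None:
            self._in_h2 = True

    def handle_data(self, data):
        if self._in_h2:
            self._h2_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._in_h2:
            self._in_h2 = False
            self.first_h2 = "".join(self._h2_parts).strip()


def external_links(page_url, hrefs):
    """Return absolute links that leave the current host and are not blocked."""
    own_host = urlparse(page_url).netloc
    result = []
    for href in hrefs:
        absolute = urljoin(page_url, href)
        parsed = urlparse(absolute)
        host = parsed.netloc
        if parsed.scheme not in ("http", "https") or host == own_host:
            continue
        if any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS):
            continue
        result.append(absolute)
    return result


def scan_page(page_url, html):
    """Return (first <h2> text or None, list of usable external links)."""
    scanner = PageScanner()
    scanner.feed(html)
    return scanner.first_h2, external_links(page_url, scanner.hrefs)


def pick_next(links, rng=random):
    """Choose the next page at random, or None at a dead end."""
    return rng.choice(links) if links else None
```

`pick_next` returning `None` marks the dead end mentioned above: a page without any usable external link stops the walk.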

This way, every website provides a couple of keywords that can be used as prompts for GPT-3, a natural language generation network by OpenAI. We send the keywords to the network via an API call and ask it for a short story. After about five seconds on average, a multi-line, sometimes multi-paragraph short story revolving around the provided keywords is returned. This is an example, based on 'books' and 'dolphins':

"Once upon a time, there was a beautiful kingdom made entirely of books. The shelves were lined with all sorts of stories, from classic fairytales to modern-day adventures. The people of this kingdom loved to read, and they would often spend hours curled up with their favorite book. One day, a terrible storm swept through the kingdom, destroying everything in its path. All of the books were blown away, and the once-lively kingdom was left in ruins. The people of the kingdom were heartbroken. They didn't know what to do without their beloved books. But then, one day, a strange woman appeared and said she could help them. She had magical powers that could bring the books back to life. With the woman's help, the kingdom was soon restored to its former glory. And everyone lived happily ever after surrounded by their favorite stories once again."

Based on this relatively simple structure, a procedural storytelling algorithm evolves.
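The GPT-3 call at the heart of this structure could look roughly like the sketch below. It is written under several assumptions: the endpoint and model name reflect the completions API as it was offered at the time and may no longer be available, and the prompt wording, the token limit and all function names are made up for illustration. Only the standard library is used.

```python
import json
import urllib.request

# Assumed values: the text names neither the endpoint, the model nor the prompt.
API_URL = "https://api.openai.com/v1/completions"
MODEL = "text-davinci-002"


def build_prompt(keywords):
    """Turn crawled keywords into a plain-language instruction (assumed wording)."""
    return "Write a short story about " + " and ".join(keywords) + "."


def build_request(keywords, api_key, max_tokens=300):
    """Prepare the HTTP request without sending it."""
    payload = {
        "model": MODEL,
        "prompt": build_prompt(keywords),
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def extract_story(reply):
    """Pull the generated text out of a decoded JSON reply."""
    return reply["choices"][0]["text"].strip()


def generate_story(keywords, api_key, timeout=30):
    """Send the request and return the story text."""
    request = build_request(keywords, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_story(json.load(response))
```

Keeping `build_request` and `extract_story` separate from the network call means the prompt construction and the JSON handling can be checked without an API key.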

The development environment including a sample of the text output (right).

Image Generator

The image generator consisted of a DALL-E mini instance running on Google Colab. We accessed it through an API, sending a prompt to the network and receiving the raw image data back. To limit data throughput and speed up generation, we fixed the resolution to 256x256 pixels, which kept the response time consistently below ten seconds. The returned data stream consisted of raw image data encoded in base64. With the help of Pietro, we converted it to PNG and then to BMP so that it could be integrated into the thermal printout.
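A minimal sketch of the decoding and conversion steps, under assumptions: it treats the decoded bytes as an image file that Pillow can open, and it reduces the picture to 1-bit black and white on the guess that a receipt printer cannot render greyscale. Pillow is a third-party library and is only imported inside `save_as_bmp`; the function names are hypothetical and the actual conversion code may have looked different.

```python
import base64
import io

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_image(b64_text):
    """Decode the base64 text from the API reply into raw bytes."""
    return base64.b64decode(b64_text)


def is_png(data):
    """Cheap sanity check before handing the bytes to an image library."""
    return data.startswith(PNG_SIGNATURE)


def save_as_bmp(image_bytes, bmp_path):
    """Write the image as a 1-bit BMP (assumes Pillow is installed)."""
    from PIL import Image  # third-party, imported lazily

    with Image.open(io.BytesIO(image_bytes)) as img:
        # Mode "1" is pure black and white; an assumption about the printer.
        img.convert("1").save(bmp_path, format="BMP")
```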

Two examples generated by DALL-E mini for the prompt "Cherry Trees".

Hardware Output

To interface with the real world, we chose two tools: input via the command line and a physical printout via a thermal printer of the kind normally used for printing receipts. Through the command line, the user can specify the number of jumps and the starting URL. This is the primary input the user provides. As for the output, the links, keywords and generated story are meant to be printed by the thermal printer, providing a continuous log of the procedural storytelling. Interfacing with the printer is the immediate next step; right now, the text output happens via the command line.
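The command-line input could be handled with `argparse` along these lines. The argument names, the default of five jumps and the lower bound are assumptions for illustration, not the interface of our actual script.

```python
import argparse


def parse_args(argv=None):
    """Read the starting URL and the number of jumps from the command line."""
    parser = argparse.ArgumentParser(description="Generative storytelling crawler")
    parser.add_argument("start_url", help="URL of the first page to visit")
    parser.add_argument(
        "-j", "--jumps", type=int, default=5,
        help="number of page-to-page jumps (assumed default: 5)",
    )
    args = parser.parse_args(argv)
    if args.jumps < 1:
        parser.error("--jumps must be at least 1")
    return args
```

Saved in a script with a hypothetical name such as `story.py`, this would be called as `python story.py https://example.com --jumps 3`.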

The thermal printer connected to the Raspberry Pi Zero via UART.

Next Steps

- Finish interfacing with the thermal printer
- Integrate image generation into story generator
- Design a protocol to deliver text and bitmaps from the Pi to the printer
- Build an enclosure

Jeremy debugging the run environment on the Raspberry Pi Zero.

My Reflection

Ultimately, the image generator should be integrated into the storytelling algorithm to provide pictures based on the keywords extracted by the web crawler. This would effectively wrap all of these components into a single content-producing program. Getting there is my personal mid-term goal. This might be achieved by the end of the semester, but it might just as well be finished only afterwards. Regardless, it will be finished eventually and will hopefully grow into a published project.

Apart from the fact that this project was not completely finished within a single week, this Fab Challenge was of critical importance for me personally. It was my first time writing Python at a larger scale and structuring it into dedicated objects. This required an understanding of software architecture, which I had no prior exposure to. Additionally, making API calls and processing JSON replies in Python was new to me.

In this sense, this week's challenge was hugely inspirational for me, since for the first time I could see how Python could serve as a bridge between different systems, interfacing between them and post-processing data. This way, software platforms increasingly reveal themselves to me not as silos with isolated environments, but simply as inputs and outputs to be combined.

GitHub Repository

All our code, including documentation, can be found here.