
Talk with Vox

    Allwire has always fancied itself an infrastructure company with a development problem; as ex-Microsofties, no amount of infrastructure projects will ever diminish our love of code.

    When OpenAI selected us in April of 2023 to be part of the first round of ChatGPT plugin developers, we faced a daunting task: design a plugin in three weeks that people would actually want to use with a fascinating new model, GPT-4, inside ChatGPT. Although we were heavy users of GPT-3.5 at the time, we had no idea what we wanted to build.

    Reese said “video search,” and TubeGPT was born. However, six hours before we were set to be approved for the first round of releases, OpenAI informed us we couldn’t use “GPT” in the name! It was emergency rebranding time, and we frantically changed all of the assets to the new name, “Voxscript.”

    Cool.

    That didn’t matter much to us, though, as we were starstruck to be developing a plugin that might sit next to Wolfram Alpha and Zapier, two launch plugins we had been staring at in the store for a few weeks.

    We got to work (on the weekends, of course; we have wonderfully understanding clients!), and Voxscript quickly evolved into one of the first C#-based AI applications. With everything AI being written in LangChain at the time, we spent a lot of time poring over the LangChain code base, with ChatGPT as our copilot, converting code into our own C# library for video transcription and analysis.

    The idea of Retrieval-Augmented Generation (RAG) wasn’t commonplace yet, and we didn’t have a firm understanding of vector-based text search outside of a few experiments in LangChain. Additionally, there was no clear path to monetization on the OpenAI store, so we kept memory/CPU usage low and opted for static textual analysis and caching of video transcripts, along with US equity analysis and web browsing, for our initial feature set.
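    To give a flavor of what “static caching” meant in practice, here is a minimal sketch in C#. It is not the actual Voxscript code (the types and the fetch delegate are illustrative), but the idea is the same: fetch and process each transcript once, then answer repeat questions about the same video from memory.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Illustrative only: cache each transcript keyed by video ID so repeated
// questions about the same video never re-fetch or re-process it.
public sealed class TranscriptCache
{
    private readonly ConcurrentDictionary<string, string> _cache = new();
    private readonly Func<string, Task<string>> _fetchTranscript; // hypothetical fetcher

    public TranscriptCache(Func<string, Task<string>> fetchTranscript)
        => _fetchTranscript = fetchTranscript;

    public async Task<string> GetAsync(string videoId)
    {
        // Serve from memory when possible; otherwise fetch once and store.
        if (_cache.TryGetValue(videoId, out var cached))
            return cached;

        var transcript = await _fetchTranscript(videoId);
        return _cache.GetOrAdd(videoId, transcript);
    }
}
```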

    In short, our goal was to keep theoretical maximum utilization at levels that could serve millions of users while still running on spare datacenter hardware, although in May of 2023 we had no idea how many hits we’d be getting daily.

    The context length of GPT-4 was only around 4k tokens, which didn’t give us a lot of short-term memory to work with, so we used techniques such as word stemming and cache search to minimize hallucinations when the context buffer was exceeded.
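    For illustration, here is a rough sketch of that kind of stem-and-match retrieval, not our production implementation: stem the user’s question, score cached transcript chunks by stem overlap, and hand only the best-scoring chunks to the model so the small context buffer isn’t blown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of stemmed keyword matching over cached transcript chunks.
public static class ChunkSelector
{
    // Crude suffix-stripping "stemmer", for illustration only.
    private static string Stem(string word)
    {
        word = word.ToLowerInvariant();
        foreach (var suffix in new[] { "ing", "ed", "es", "s" })
            if (word.Length > suffix.Length + 2 && word.EndsWith(suffix))
                return word[..^suffix.Length];
        return word;
    }

    private static HashSet<string> StemsOf(string text) =>
        text.Split(' ', StringSplitOptions.RemoveEmptyEntries)
            .Select(Stem)
            .ToHashSet();

    // maxChars is a rough character budget standing in for a ~4k-token limit.
    public static IEnumerable<string> Select(string query, IEnumerable<string> chunks, int maxChars = 12000)
    {
        var queryStems = StemsOf(query);
        var ranked = chunks
            .Select(c => (Chunk: c, Score: StemsOf(c).Intersect(queryStems).Count()))
            .Where(x => x.Score > 0)
            .OrderByDescending(x => x.Score);

        var used = 0;
        foreach (var (chunk, _) in ranked)
        {
            if (used + chunk.Length > maxChars) break; // stay inside the context budget
            used += chunk.Length;
            yield return chunk;
        }
    }
}
```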

    Voxscript went on to become the second most popular plugin in the store, behind a PDF reader (these were simpler times), and we kept releasing new versions.

    We always wanted to do more with Vox, and we knew we couldn’t do that with the spare CPU cycles we had around the datacenter…

    Once the initial shock and novelty of being able to converse with your videos wore off, it became evident that Voxscript needed larger context windows and RAG to deliver a truly rich experience, one that lets users really interact with, say, videos over 20 minutes long.

    After a brief development hiatus waiting for C# tooling to catch up, Microsoft finally released beta 1 of Semantic Kernel, and we got to work rebuilding Voxscript for Microsoft Teams.
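    For anyone curious what building on Semantic Kernel looks like at its simplest, here is a minimal sketch against the current Semantic Kernel 1.x API (the beta 1 surface we started on has since changed); the model name and prompt are placeholders, not Voxscript internals.

```csharp
using System;
using Microsoft.SemanticKernel;

// Build a kernel backed by an OpenAI chat model.
var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(
    modelId: "gpt-4o", // placeholder model name
    apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!);
var kernel = builder.Build();

// Render a prompt template against a cached transcript and invoke the model.
var transcript = "..."; // a cached video transcript
var result = await kernel.InvokePromptAsync(
    "Summarize the key points of this transcript:\n{{$transcript}}",
    new KernelArguments { ["transcript"] = transcript });

Console.WriteLine(result);
```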

    We have also created an API that folks can use to build their own GPT actions in Python or with Semantic Kernel. Details here.

    Our initial beta is out in the Teams App Store in Early Access today:

    Teams App Store

    You can also find our API and documentation on our Github:

    https://github.com/Voxscript/voxscript-demos/

    Shameless Plug: Have a C#-based AI project that Allwire might be a good fit for? Feel free to get in touch with us!
