June 16, 2025
Added a Settings page and finalized everything for an iOS release. Did some UI fixes and updates to make everything look extra nice. Fixed bugs, and also added expo-updates so small bug fixes and UI changes can ship as OTA releases without going through the storefront pipelines and their waiting times.
Created a Launches page, where you can track upcoming rocket launches and their payloads. I also added a button that puts the event directly in your calendar so you don't forget.
Since the API has a rate limit, I implemented caching with AsyncStorage to poll once every 10 minutes. Shout out to the Launch Library 2 API!
Finally was able to package everything into a Docker image, and all tests pass in a container running Linux!
After doing some research, I was able to include a faster (albeit more limited) implementation of openai-whisper called faster-whisper, based on CTranslate2 and able to use quantization. It also uses less RAM (!!!) and less CPU time. This is under a new endpoint because I have not done enough testing with it, so it is considered a BETA feature for now. I also included tests for this and made the endpoint largely the same except for the exclusion of a language field.
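For reference, here is a minimal sketch of what a faster-whisper call with int8 quantization can look like. It is not the actual endpoint code: the function names `transcribe_beta` and `join_segments`, the default model size, and the returned dict shape are all assumptions for illustration. No language argument is passed, which matches the beta endpoint leaving out the language field and letting the model detect it.

```python
from typing import Iterable


def join_segments(segments: Iterable) -> str:
    """Concatenate the text of each segment into one transcript string."""
    return " ".join(seg.text.strip() for seg in segments)


def transcribe_beta(path: str, model_size: str = "tiny") -> dict:
    """Hypothetical helper: transcribe one audio file with faster-whisper."""
    # Imported lazily so this module still loads where faster-whisper is absent.
    from faster_whisper import WhisperModel

    # compute_type="int8" asks CTranslate2 to run with 8-bit quantized weights.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")

    # transcribe() returns a lazy generator of segments plus an info object;
    # the actual decoding happens while the generator is consumed.
    segments, info = model.transcribe(path)
    return {"language": info.language, "text": join_segments(segments)}
```

Calling `transcribe_beta("clip.mp3")` would download the model on first use, so the first request is noticeably slower than later ones.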
After fiddling around with language support and making sure the API does not use a ridiculous amount of resources, I was finally able to improve the transcription feature. It now supports translation (albeit only with the 'tiny' model) and is covered by tests.
After a long fight with ffmpeg, I finally found compatible codecs and ffmpeg args for all conversions that this platform plans to support.
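A rough sketch of how a per-format codec table can drive the ffmpeg call. The table below is an assumption for illustration, not the set of codecs and args this platform actually settled on; the encoder names are standard ffmpeg encoders, and `build_ffmpeg_command` and `convert` are hypothetical helper names.

```python
import subprocess
from pathlib import Path

# Assumed mapping from target extension to ffmpeg codec arguments.
CODEC_ARGS = {
    "mp3": ["-c:a", "libmp3lame"],
    "ogg": ["-c:a", "libvorbis"],
    "wav": ["-c:a", "pcm_s16le"],
    "mp4": ["-c:v", "libx264", "-c:a", "aac"],
    "webm": ["-c:v", "libvpx-vp9", "-c:a", "libopus"],
}


def build_ffmpeg_command(src: str, dst: str) -> list[str]:
    """Build the ffmpeg argument list for converting src into dst's format."""
    ext = Path(dst).suffix.lstrip(".").lower()
    if ext not in CODEC_ARGS:
        raise ValueError(f"unsupported target format: {ext!r}")
    # -y overwrites an existing output file instead of prompting.
    return ["ffmpeg", "-y", "-i", src, *CODEC_ARGS[ext], dst]


def convert(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
```

Keeping the command builder separate from the `subprocess` call means the argument list can be checked in tests without ffmpeg installed.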
I also wrote 172 tests (most are parametrized) for the API, file handling (upload, download, deletion) and file conversion (images, audio, video). In addition, I moved image conversion over to PIL instead of ffmpeg, improving performance.
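As an illustration of the Pillow route, a minimal conversion helper might look like the following. The names `PILLOW_FORMATS`, `pillow_format_for`, and `convert_image` are made up for this sketch, and the list of formats is an assumption rather than what the API actually supports.

```python
from pathlib import Path

# Assumed mapping from file extension to Pillow format name.
PILLOW_FORMATS = {
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
    ".png": "PNG",
    ".webp": "WEBP",
    ".gif": "GIF",
    ".bmp": "BMP",
}


def pillow_format_for(dst: str) -> str:
    """Return the Pillow format name for the destination file's extension."""
    suffix = Path(dst).suffix.lower()
    if suffix not in PILLOW_FORMATS:
        raise ValueError(f"unsupported image format: {suffix!r}")
    return PILLOW_FORMATS[suffix]


def convert_image(src: str, dst: str) -> None:
    """Convert an image file to the format implied by dst's extension."""
    # Imported lazily so the format lookup works without Pillow installed.
    from PIL import Image

    fmt = pillow_format_for(dst)
    with Image.open(src) as img:
        # JPEG has no alpha channel, so RGBA or palette images are
        # converted to RGB before saving.
        if fmt == "JPEG" and img.mode != "RGB":
            img = img.convert("RGB")
        img.save(dst, format=fmt)
```

Doing this in-process avoids spawning an ffmpeg subprocess per image, which is one plausible reason for the speedup.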
TODO: write tests for transcription, implement cleanup of old files
After making some edits to the openai-whisper codebase and submitting a PR, I finally have some core features complete in the API:
Website + API powered by ffmpeg, openai-whisper, pillow, and other packages that provides advanced media tools over the web. Features include file type conversion, audio language detection, and audio transcription.
Finished the website: added information about app offerings, added mockups for iOS/Android, fixed up the header, added a download section, and optimized the entire site for mobile.
Wrote the code for the first page, integrating React Bits components.
Website for my mobile app, Quasar
This was widely regarded as a great move by everyone.