Since making people take showers is borderline impossible in the tech field, I've decided to solve the 2nd most prevalent problem.
And yes, it's SPEAKING. Way too many tech people are afraid of speaking, so much so that they think if they tried to defend their parking ticket in court,
they would end up with 3 life sentences, 407 years of imprisonment with no parole + treason.
I'm not claiming my project can help you win court cases but hey at least the judge will hear you better and hopefully you get 406 years instead of 407...?
My web app transcribes your speech, displays a bunch of statistics and helps you identify your general shortfalls.
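To give a feel for the kind of statistics a transcript can produce, here's a minimal sketch. It is not the app's actual code: the function name `speech_stats`, the `FILLERS` list and the choice of metrics (word count, words per minute, filler words) are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical filler list; the app's real list is not shown in this post.
FILLERS = {"um", "uh", "like", "basically", "literally"}


def speech_stats(transcript: str, duration_seconds: float) -> dict:
    """Compute a few simple statistics from a transcript and its duration."""
    # Lowercase the text and keep only word-like tokens (letters and apostrophes).
    words = re.findall(r"[a-z']+", transcript.lower())
    minutes = duration_seconds / 60 if duration_seconds > 0 else 0
    filler_counts = Counter(w for w in words if w in FILLERS)
    return {
        "word_count": len(words),
        # Guard against a zero-length recording instead of dividing by zero.
        "words_per_minute": round(len(words) / minutes, 1) if minutes else 0.0,
        "filler_count": sum(filler_counts.values()),
        "fillers": dict(filler_counts),
    }


if __name__ == "__main__":
    print(speech_stats("Um, I think this is, like, basically fine", 60))
```

Counting "like" as a filler every time is a rough heuristic, so treat the filler numbers as a hint rather than a verdict.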
Okay so i've removed a bunch of metrics, i'm gonna revamp the speech update metric tomorrow and have also fixed a couple of bugs related to waveform.js that would throw errors everywhere xd
I've fixed the bugs related to the wavesurfer plugin, it refused to add the pause regions, weirdly enough most of the docs related to it were showing conflicting info on each version so i decided to roll it forward and use the doc which had info on an ENTIRELY different version. Turns out that worked lmaoooooo
Currently stuck in a very vile, cruel, humiliating error. It says it's undefined??? defo gonna sue wavesurfer devs for this
Bet you wouldn't expect viva la vida lyrics to be used to test my app's waveform function but HEY HERE WE ARE it's 2025 so we've to mix up our test data a bit uk what im sayin
i'm well under way to creating the waveform chart which im planning to use to display the pauses + intonation at some point BUT for now in the immediate future, pauses it is.
oh my goodness i think i actually figured it out holy, a custom pitch analysis algorithm using parselmouth DAYUM, it's actually really reliable as well i'm so proud of myself. Math really does help lmfaooo
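The actual algorithm isn't shown in this devlog, so here's only a rough sketch of one common way to turn a pitch track into an intonation number: convert each voiced frame to semitones around the speaker's median pitch and take the standard deviation. The function name `intonation_spread`, the 50 Hz floor and the assumption that unvoiced frames arrive as 0 Hz are all illustrative choices, and the input is a plain list of F0 values rather than a parselmouth object.

```python
import math
import statistics


def intonation_spread(f0_hz, floor_hz=50.0):
    """Return the spread of pitch, in semitones, around the median voiced pitch."""
    # Assumption: unvoiced frames are reported as 0 Hz, so drop anything below the floor.
    voiced = [f for f in f0_hz if f >= floor_hz]
    if len(voiced) < 2:
        return 0.0
    median = statistics.median(voiced)
    # Semitones are a log scale, so low and high voices become comparable.
    semitones = [12 * math.log2(f / median) for f in voiced]
    return statistics.pstdev(semitones)


if __name__ == "__main__":
    flat = [0, 120, 121, 119, 0, 120]
    lively = [0, 110, 160, 95, 0, 180]
    print(intonation_spread(flat), intonation_spread(lively))
```

Working in semitones instead of raw Hz is one way to keep the values from blowing up for higher-pitched voices.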
Please save me from parselmouth this thing is NOT for the weak. Python data analysis in general is a lot of trial and error and i've been scratching my head as to why my values are so high. ai + obscure libraries is a recipe for disaster ~Mov
Also forgot to mention, i had this feature for a while now and i think i did mention it BUT i never talked about it at length in a devlog. Introducing the recordings page where you can see up to 5 MB worth of your past recordings :D (speech rate is a placeholder in the recordings section at the moment, don't flame me)
the about and home page looked really desolate without any charming colors so what did i decide to do? SLAP ON SOME GLOW EFFECTS ngl it looks ai generated but i promise it's not
okaaay that may have taken a lot more time than i anticipated but hey, was able to make a MAJOR update to my UI color scheme. It looks way cleaner + the orange and deep dark blue make a very nice combo hahaa
Didn't get to post a devlog because SoM was getting DDoSed or something, not too sure. anyway i've decided to scrap gradients because people said it gave off vibecoded vibes
okay, now to replace my voice activity detection algorithm with an interesting little library from python (webrtcvad). It's time to be great, i'm halfway through the implementation and i hope it's gonna be worth the time
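For anyone curious how voice activity detection turns into pause data, here's a minimal sketch of the step that comes after the detector. It assumes you already have one True/False speech flag per fixed-length frame (webrtcvad works on 10, 20 or 30 ms frames and returns that kind of flag from `Vad.is_speech`). The function name `find_pauses` and the 300 ms threshold are assumptions, not the app's real values.

```python
def find_pauses(speech_flags, frame_ms=30, min_pause_ms=300):
    """Return (start_ms, end_ms) for every silent run that lasts at least min_pause_ms."""
    pauses = []
    run_start = None  # index of the first frame of the current silent run
    for i, is_speech in enumerate(speech_flags):
        if not is_speech and run_start is None:
            run_start = i
        elif is_speech and run_start is not None:
            if (i - run_start) * frame_ms >= min_pause_ms:
                pauses.append((run_start * frame_ms, i * frame_ms))
            run_start = None
    # Close a silent run that reaches the end of the recording.
    total = len(speech_flags)
    if run_start is not None and (total - run_start) * frame_ms >= min_pause_ms:
        pauses.append((run_start * frame_ms, total * frame_ms))
    return pauses


if __name__ == "__main__":
    flags = [True] * 5 + [False] * 12 + [True] * 3 + [False] * 2 + [True] * 2
    print(find_pauses(flags))
```

The minimum length filter matters because short gaps between words would otherwise all show up as pauses.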
WOOP WOOP got the confidence score metric down (very amazing accuracy). let's hope i can continue with the same pace for the rest of the metrics muahahahahah
i'm just gonna hope that the metrics i've built rn can actually hold true for long speeches, they seem fine for minute long speeches but hey, time will tell
okay so i've found out AI is also not good at coding conditionals. Pain in the ukw im gonna say, anyway im sure you guys are bored of JUST coding updates so here's a 100% ai generated banner that i'll be using
Removed A LOT of sloppy comments from AI, i actually hate AI but oh well i'm glad i'm refactoring everything by myself
Spent some time focusing on the UI for mobile, if it's a full stack app i gotta cover all the bases yk what im sayin?
Motivation is a big part of improvement so i thought of giving a little push while logging in or registering. I've implemented a dictionary of preexisting quotes and the page displays one of them at random. Here's a sick demo of that happening
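The idea can be sketched in a few lines. This is an illustration only: the quotes below are placeholders, and the `QUOTES` layout (a dictionary keyed by page name) and the `pick_quote` helper are assumptions, not the app's real code.

```python
import random

# Placeholder quotes keyed by page; the app's real quotes are not shown in this post.
QUOTES = {
    "login": [
        "Welcome back. One more practice run never hurts.",
        "Slow down, breathe, then speak.",
    ],
    "register": [
        "Every good speaker started out nervous.",
        "First recording is the hardest. Go for it.",
    ],
}


def pick_quote(page, quotes=QUOTES, rng=random):
    """Return a random quote for the given page, or an empty string if there are none."""
    options = quotes.get(page, [])
    return rng.choice(options) if options else ""


if __name__ == "__main__":
    print(pick_quote("login"))
    print(pick_quote("register"))
```

Passing `rng` in makes it easy to seed the picker when you want the same quote every time, for example in a demo.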
Rise and shine guys, i just dropped an insane update where i hotfixed many components to ensure none of you guys send infinite transcription requests
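The hotfix itself lives in the app's components and isn't shown here. As a hedged illustration of the general idea, this sketch caps how many transcription requests one client can send inside a sliding time window. The class name `RequestLimiter` and the limits (5 requests per 60 seconds) are made up for the example.

```python
import time
from collections import deque


class RequestLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests=5, window_seconds=60.0, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._stamps = deque()

    def allow(self):
        """Return True and record the request if it fits in the window, else False."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) >= self._max:
            return False
        self._stamps.append(now)
        return True


if __name__ == "__main__":
    limiter = RequestLimiter(max_requests=2, window_seconds=1.0)
    print([limiter.allow() for _ in range(3)])
```

In a real app you would keep one limiter per user or per session rather than a single global one.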
Made the modular navbar, it now works across all pages except the try it and login pages yessirrr. did this while im in a car btw
cheeky comparison between the old and new home page (new on top, old on the bottom). Surprisingly i made the new one by myself and the old one is by AI bro....
Wasted a lot of time experimenting with different styles here and there ngl and I'M STILL NOT COMPLETE urgh at least there are no programming conflicts like there were when i used AI
Reformatted and restructured the about page, here's a video snippet to show how cleaaaan my code is now
Okay dayuuuum fresh look for the about page, minor tweaks here and there + fixed a few bugs. The usual uk?
Ngl i don't like that AI has ruined my reputation. I've decided to stop all AI use as i've noticed many design inconsistencies and general ugliness in my code... gonna be refactoring A LOT of my code
had some issues with implementing the profile tab but a solution appeared in my dream. Henceforth i shall commence working on DB integration. Currently it sends all the recordings to my DB but i need to work on fetching it based on the user credentials.
Managed to make the transcription faster, updated a few things on my python backend and built my own auth system instead of using auth0
Made a login page, planning to host all the audio recordings in MongoDB + Azure blob storage. It's looking cool! Gonna use the login page as a launchpad for a full blown authentication system
Successfully added a bunch of working speech metrics. A bit of a hassle to manage the authentication and the saved speech data. Was able to host everything on an Azure VM (yes including token decoding and the transcription). Shifted all calculations to the front end.