June 16, 2025
With the collaborative canvas operational, my focus shifted to refining the user experience, particularly around navigation and visual clarity. Pan and zoom worked, but felt abrupt. To address this, I implemented a much smoother, animated zoom using requestAnimationFrame. Instead of jarring jumps, the canvas now glides to the target scale, creating a more fluid and professional feel. To complement this, I added a zoom indicator to the toolbar, displaying the current zoom percentage and ratio, giving users a clear sense of their viewport. The grid system also received a major overhaul. The previous static grid was replaced with a dynamic, shape-based grid that intelligently adjusts its density based on the zoom level. As you zoom in, more subdivisions appear, providing greater precision, and as you zoom out, they fade away to keep the view uncluttered. This ensures the grid is always useful, never overwhelming.
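A minimal sketch of the kind of requestAnimationFrame-driven zoom easing described above, assuming a Konva Stage (the function name, duration, and easing curve are illustrative, not the project's actual code):

```typescript
import Konva from "konva";

// Smoothly animate the stage from its current scale to a target scale,
// easing over a fixed duration instead of jumping in one step.
function animateZoom(stage: Konva.Stage, targetScale: number, durationMs = 250) {
  const startScale = stage.scaleX();
  const start = performance.now();

  const step = (now: number) => {
    const t = Math.min((now - start) / durationMs, 1);
    // Ease-out cubic: fast at first, gliding into the target scale.
    const eased = 1 - Math.pow(1 - t, 3);
    const scale = startScale + (targetScale - startScale) * eased;

    stage.scale({ x: scale, y: scale });
    stage.batchDraw();

    if (t < 1) requestAnimationFrame(step);
  };

  requestAnimationFrame(step);
}
```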
To further enhance usability, I introduced a Reset View button that smoothly animates the canvas back to its default position and zoom level. I also made the pen color theme-aware, automatically switching between black and white for optimal visibility in light and dark modes.
With the core features in place, the next major step was to enable private, shareable sessions. I transformed the landing page into a hub for creating or joining rooms, using nanoid to generate unique 7-character room IDs. To keep the UI responsive, I implemented React's useTransition hook for non-blocking loading states. I then replaced the static canvas with a dynamic [roomId] route. The key to isolating sessions was making the Ably channel subscription dynamic, creating a unique channel for each room based on its ID.
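Roughly, the room creation and channel isolation look like this (a hedged sketch; the channel naming scheme and auth endpoint are assumptions, not the project's actual code):

```typescript
import { nanoid } from "nanoid";
import * as Ably from "ably";

// Generate a short, URL-friendly ID for the new room.
const roomId = nanoid(7); // e.g. "V1StGXR"

// Each room gets its own Ably channel keyed by the room ID, so strokes
// published in one session never reach another.
const ably = new Ably.Realtime({ authUrl: "/api/ably-token" }); // assumed auth route
const channel = ably.channels.get(`whiteboard:${roomId}`);

channel.subscribe("stroke", (msg) => {
  // Render the incoming remote stroke on the local canvas.
  console.log("remote stroke", msg.data);
});
```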
To prevent the blank canvas issue for new joiners, I implemented a state synchronization handshake. A new client requests the current state, and an existing client responds by publishing the entire canvas history, instantly syncing the new user.
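Continuing the sketch above, the handshake could look something like this (localStrokeHistory and setStrokes are hypothetical stand-ins for the app's own state):

```typescript
// Hypothetical stand-ins for the app's stroke state.
let localStrokeHistory: unknown[] = [];
const setStrokes = (strokes: unknown[]) => { localStrokeHistory = strokes; };

// Existing clients answer state requests with their full stroke history.
channel.subscribe("request-state", async () => {
  await channel.publish("sync-state", { strokes: localStrokeHistory });
});

// A freshly joined client asks the room for the current canvas...
await channel.publish("request-state", {});

// ...and replaces its empty canvas with whatever history comes back.
channel.subscribe("sync-state", (msg) => {
  setStrokes(msg.data.strokes);
});
```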
Finally, to improve the sharing experience, I built a CollapsibleShareNotification for room creators, which automatically appears and provides easy copy-to-clipboard functionality for the room link and code. A crucial fix to the Ably auth URL ensures reliable connections from the new nested routes.
I updated the app's UI, moving everything to Material 3 to give it a modern and polished feel. The core logic was updated to use the androidx.palette library not just to pull a palette, but to analyze the image and find the single most dominant color.
For the UI, this dominant color is now used to dynamically theme the entire app. When you select an image, the app's colors, from the buttons to the background, regenerate to match the photo's aesthetic. I rebuilt the main screen using a Scaffold and replaced the old buttons with a ModalBottomSheet for a cleaner image selection experience.
Next up, I'm going to build a real-time color generation feature. The idea is to use the camera to constantly analyze the environment and display a live palette. From there, users will be able to lock colors they like, saving them one by one until they've built an entire custom palette from the world around them.
I started by building the foundation for a real-time collaborative whiteboard using Next.js and Konva.js. The initial goal was to get the core drawing mechanic working, so I used Ably to sync pen strokes between users. It worked, but the experience was limited; everyone was drawing inside the same fixed-size box.
The challenge was to make the canvas infinite. My solution was to scrap the static background entirely and make a dynamic grid. It hooks into the canvas's pan and zoom events to calculate and render only the grid lines visible in the current viewport. This created a seamless illusion of a boundless space.
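The viewport culling itself is mostly coordinate conversion; a sketch of the idea (function name and grid spacing are illustrative):

```typescript
// Compute which vertical grid lines are visible at the current pan/zoom,
// so only those are drawn instead of an "infinite" number of lines.
function visibleGridLinesX(
  stageX: number,        // current pan offset in screen pixels
  scale: number,         // current zoom level
  viewportWidth: number, // canvas width in screen pixels
  spacing: number        // grid spacing in world units
): number[] {
  // Convert the viewport edges from screen space into world space.
  const worldLeft = -stageX / scale;
  const worldRight = (viewportWidth - stageX) / scale;

  const lines: number[] = [];
  const first = Math.floor(worldLeft / spacing) * spacing;
  for (let x = first; x <= worldRight; x += spacing) {
    lines.push(x);
  }
  return lines;
}
```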
With the core logic in place, I shifted to the user-facing side of the project. I built the main toolbar with essential tools like a pen, an eraser, and a pan control. To give users more creative freedom, I added a proper color picker and a stroke width slider. The final touches were a toggle to hide the grid and a theme switcher.
I've implemented a settings screen, allowing persistent defaults for themes and compression quality via DataStore. The app now integrates with the Android share sheet, letting you send media directly from your gallery or other apps. I've also added the ability to cancel an in-progress compression.
There was a theme-flickering bug when switching from the main activity to the settings screen: for a split second, the app always displayed your device's native Android theme, no matter which theme you had selected in the app. I fixed it by ensuring the correct theme is loaded before the UI renders. However, now I'm tackling another issue: the 'Image Quality' slider in the settings screen suffers from input lag.
I started by setting up image selection, allowing the user to either snap a photo with the camera or pick one from the gallery. The core logic uses the androidx.palette library to analyze the selected image and pull out the six most dominant colors. I spent some time refining this, making sure to sort the extracted swatches by population to get the most relevant palette.
For the UI, the palette is displayed in a simple Row beneath the image. I made each color swatch tappable, which brings up a dialog showing the detailed HEX, RGB, and HSL values.
Next up, I've started working on a lock feature. The idea is to let users lock a color they like in the palette and then regenerate the other colors around it.
Pigment is an Android app that generates color palettes from your photos. You can either take a new picture or choose one from your gallery. The app then displays the dominant colors from the image, and you can tap on any color to see its HEX, RGB, and HSL codes.
I added persistent settings management using electron-store: the app now saves and loads your preferences, and every setting is saved automatically as you change it. I restructured the main form to be more logical and intuitive; you now select the OCR engine first, and the language selection sections only appear if you choose an engine that requires them (Tesseract). For better control, a Cancel button now appears during the conversion process, allowing you to stop the operation at any time and terminate the background task to free up system resources. I also added a Clear button to easily reset the PDF file input and any language selections, making it faster to start a new task.
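A minimal sketch of how electron-store handles this kind of persistence (the settings shape and keys here are assumptions, not Scannio's actual schema):

```typescript
import Store from "electron-store";

// Assumed shape of the persisted preferences (illustrative only).
interface Settings {
  ocrEngine: "tesseract" | "paddleocr" | "ai-vision";
  languages: string[];
  outputFormat: "epub" | "txt";
}

const store = new Store<Settings>({
  defaults: { ocrEngine: "tesseract", languages: ["eng"], outputFormat: "epub" },
});

// Settings are written the moment the user changes them...
store.set("ocrEngine", "paddleocr");

// ...and read back on the next launch.
const engine = store.get("ocrEngine");
console.log("loaded engine:", engine);
```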
I've successfully migrated our backend from an Express server to a single Vercel serverless function. A key challenge was resolving a Spotify 'Illegal redirect_uri' error, which was caused by Vercel's unique deployment URLs. I fixed this by using the VERCEL_PROJECT_PRODUCTION_URL environment variable for production builds, stabilizing the authentication flow.
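Something along these lines, sketched with an assumed callback path:

```typescript
// Pick a stable base URL for the Spotify redirect URI: the fixed production
// domain when available, the per-deployment URL otherwise, localhost in dev.
const baseUrl = process.env.VERCEL_PROJECT_PRODUCTION_URL
  ? `https://${process.env.VERCEL_PROJECT_PRODUCTION_URL}`
  : process.env.VERCEL_URL
  ? `https://${process.env.VERCEL_URL}`
  : "http://localhost:3000";

// The callback path is an assumption for illustration.
const redirectUri = `${baseUrl}/api/callback`;
```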
I started by building the foundation of the application, focusing first on getting the Spotify login to work securely. My goal was to allow users to connect their accounts as the first step. Once that was handled, I created the core game mechanic: a system that pulls two random songs from Spotify to present to the player in each round. I made sure the song selection was varied to keep the game interesting. With the core logic in place, I shifted to the user-facing side of the project. I designed the main screens, from the initial login page to the game itself, and the game-over screen. I created the visual elements for displaying the songs and the player's score, and I chose a dark, terminal-like theme to give it a polished look.
Most recently, I focused on making the gameplay feel seamless and fast. I implemented a pre-loading system so that the next pair of songs is already fetched while the player is still on the current round. To complement this, I added smooth animations for the transitions between rounds.
Statify is a "higher or lower" web game using the Spotify API. After connecting their account, players guess which of two songs has a higher popularity score. Correct guesses increase your score, while an incorrect one ends the round. The app features a sleek, dark-themed UI built with React and Tailwind CSS.
I've completed the major UI overhaul, fully integrating the compression features into a more intuitive and visually appealing experience.
The old, basic layout has been completely replaced with a modern, Card-based design built on a polished Material 3 theme. For image compression, the new interface now features a live preview of the selected image and a dedicated quality slider, giving you fine-grained control over the output.
For video compression, the technical dropdowns are gone. In their place are user-friendly preset cards, making it easy to choose the right balance of quality and file size. This new UI is fully connected to the recently upgraded backend, leveraging the intelligent image scaling and refined bitrate calculations to give you more effective and accurate compression than before.
I've finished a significant update to the core compression engine. The backend logic is now much smarter: image compression now includes intelligent scaling to reduce file sizes more effectively, and video compression has a refined bitrate calculation that can accurately target a user-specified file size. The system is now fully equipped to handle presets and quality settings from the UI. I'm now in the middle of a complete UI overhaul to make the app more intuitive and visually appealing. As the screenshot shows, I'm replacing the old, basic layout with a modern, Card-based design. For images, this new UI displays a preview and a dedicated quality slider for fine-grained control. The next step is to finish the video compression interface, which will feature user-friendly preset cards instead of technical dropdowns, all built on a new, more polished Material 3 theme.
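The bitrate side of that is mostly arithmetic; roughly this (a TypeScript sketch of the idea only, since the app itself is Kotlin, and the audio bitrate and overhead figures are assumptions):

```typescript
// Total bits available divided by duration, minus the audio share, gives the
// video bitrate needed to land near a target file size.
function targetVideoBitrate(
  targetSizeMB: number,
  durationSeconds: number,
  audioBitrateBps = 128_000 // assumed audio allocation
): number {
  const totalBits = targetSizeMB * 1024 * 1024 * 8;
  const totalBitrate = totalBits / durationSeconds;
  // Leave a little headroom for container overhead (~2%, an assumed figure),
  // and never drop below a minimal floor.
  return Math.max(totalBitrate * 0.98 - audioBitrateBps, 100_000);
}
```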
I began by setting up the bare bones of the application: a new Android project using Kotlin and the modern Jetpack Compose for the UI.
My first real task was to make the app do anything at all xD. I needed to let the user actually select a file from their device to compress. This was the first major step, turning the app from a concept into an interactive tool. Once a file was selected, I made the UI update to show its name.
I implemented a straightforward process: decode the source file into a Bitmap, then re-compress it as a JPEG with a lower quality setting. After some research, I decided to use the androidx.media3.transformer library. Getting the transformer to properly re-encode the video to H.264 and resize it to a smaller resolution was tricky. It was an asynchronous process, so I had to hook into its listeners to know when the job was done or if it failed.
Then I added a progress screen with a loading spinner. To make it even better, I added a progress monitoring loop that checks the status of the Transformer, allowing me to display a real-time percentage to the user.
The last thing I did was put the finishing touches on the user experience. I wrapped the whole interface in a Material 3 scaffold.
An intuitive Android application for efficiently compressing images and videos to save storage space. Built with Jetpack Compose and Material 3, it integrates with the Android share menu for easy media import and allows for quick sharing of the compressed result.
I implemented a new wireframe rendering mode to offer an alternative way of visualizing STL models. It simply draws the edges of each triangle in the mesh, and at its core is an efficient, Numba-jitted version of Bresenham's line algorithm. To make it user-friendly, I integrated this new capability into the main rendering pipeline and added a 'w' keybinding to allow easy toggling between the standard shaded view and the new wireframe display.
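For reference, Bresenham rasterizes a line using only integer arithmetic, which is what makes it so cheap to run per triangle edge; a generic sketch (shown in TypeScript for illustration; the project's version is Python compiled with Numba):

```typescript
// Classic integer-only Bresenham: walk from (x0, y0) to (x1, y1),
// returning every cell on the line.
function bresenham(x0: number, y0: number, x1: number, y1: number): [number, number][] {
  const points: [number, number][] = [];
  const dx = Math.abs(x1 - x0);
  const dy = -Math.abs(y1 - y0);
  const sx = x0 < x1 ? 1 : -1;
  const sy = y0 < y1 ? 1 : -1;
  let err = dx + dy;

  while (true) {
    points.push([x0, y0]);
    if (x0 === x1 && y0 === y1) break;
    const e2 = 2 * err;
    if (e2 >= dy) { err += dy; x0 += sx; } // step in x
    if (e2 <= dx) { err += dx; y0 += sy; } // step in y
  }
  return points;
}
```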
The goal was to integrate new, more powerful engines to give users greater flexibility and improve text recognition quality.
The first new addition was a more experimental feature: AI Vision OCR. This engine connects to a locally running AI model (via LM Studio) with image analysis capabilities. In this mode, Scannio sends the rendered PDF pages directly to the model, which looks at them and interprets their content.
Next, I integrated PaddleOCR. Instead of sticking purely to JavaScript-based solutions, I opted to run an external Python script. This allows the application to leverage PaddleOCR's advanced capabilities, which should significantly increase precision, especially with non-standard documents. The entire operation is handled by the dedicated worker thread, ensuring the user interface remains fully responsive.
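A hedged sketch of what running an external OCR script from the worker can look like, assuming Node's child_process (the script path and CLI arguments are made up, not Scannio's actual ones):

```typescript
import { spawn } from "node:child_process";

// Run an external PaddleOCR script on one page image and resolve with its
// stdout. Called from the worker thread, so the renderer never blocks.
function runPaddleOcr(imagePath: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const proc = spawn("python", ["ocr/paddle_ocr.py", imagePath]); // hypothetical script
    let output = "";

    proc.stdout.on("data", (chunk) => (output += chunk.toString()));
    proc.stderr.on("data", (chunk) => console.error(chunk.toString()));
    proc.on("close", (code) =>
      code === 0 ? resolve(output) : reject(new Error(`PaddleOCR exited with ${code}`))
    );
  });
}
```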
My first major task was converting PDFs to images for OCR. I initially tried a Node.js wrapper for Poppler, but it was slow and inefficient at saving files. I then switched to pdf-to-png-converter, which was better, but PDFium, the engine from Google Chrome, proved to be the fastest and most effective solution.
With the conversion method decided, I started working on image extraction and OCR. Then I moved the PDF processing and OCR to a separate worker thread to keep the user interface responsive, even with large files.
Currently, I'm exploring some post-processing steps to automatically correct common errors, fix weird formatting, and remove other junk the OCR process leaves behind.
Scannio is a desktop app that converts PDFs to EPUB or TXT files. It intelligently handles both regular and scanned PDFs by using multiple OCR technologies—including on-device Tesseract and PaddleOCR, as well as cloud-based AI services—to accurately extract text.
First, I tackled a bug where dragging clips sometimes showed a dark vignette overlay in Chrome or Edge. To fix this, I reworked the drag preview logic: the custom drag image is now always clean and consistent, so you never see the browser's fallback overlay.
Next, I made the timeline preview and layout more compact by reducing unnecessary vertical space, which helps the UI feel less cluttered, especially with lots of tracks.
One of the biggest usability improvements was adding dynamic zoom for the timeline. There are now intuitive zoom in/out buttons in the header, and you can hold Shift and scroll to zoom. Every timeline element—including the ruler, clips, and preview—scales smoothly with the zoom level, so you can easily switch between fine detail and big-picture views.
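A small sketch of the Shift+scroll behavior (the element ID, zoom bounds, and zoom variable are illustrative stand-ins, not PeerStudio's actual code):

```typescript
// Illustrative only: the timeline element and zoom state are stand-ins.
const timelineEl = document.getElementById("timeline") as HTMLElement;
let pixelsPerSecond = 100; // current zoom level
const MIN_ZOOM = 10;
const MAX_ZOOM = 400;

timelineEl.addEventListener(
  "wheel",
  (e: WheelEvent) => {
    if (!e.shiftKey) return; // plain scrolling keeps its normal behavior
    e.preventDefault();

    const factor = e.deltaY < 0 ? 1.1 : 0.9;
    pixelsPerSecond = Math.min(MAX_ZOOM, Math.max(MIN_ZOOM, pixelsPerSecond * factor));
    // Re-render the ruler, clips, and preview at the new scale here.
  },
  { passive: false }
);
```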
I also introduced a solo feature for both tracks and clips: now you can solo any track or clip from the context menu, which mutes all others and highlights the soloed item with a yellow border. This is super helpful for focusing on specific parts of your project while editing or mixing.
I fixed an issue where deleting a track sometimes left its clips' audio playing in the background. Now, deleting a track reliably stops and cleans up all its audio players.
Finally, I made the context menu more consistent: the right-click event handler was not attached to the 'No tracks yet' message, so the menu previously only appeared when clicking on specific areas of the timeline.
First, I fixed alignment and syncing issues with the timeline ruler and sidebar collapse, making layout changes more robust. Next, I restored the playhead line and improved how the timeline updates its dimensions using ResizeObserver.
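For the dimension updates, a ResizeObserver sketch roughly like this (the element ID and handler are hypothetical stand-ins):

```typescript
// Hypothetical stand-in for wherever the app stores timeline dimensions.
function updateTimelineDimensions(width: number, height: number): void {
  console.log(`timeline resized to ${width}x${height}`);
}

// Keep cached dimensions in sync with the rendered size, e.g. after the
// sidebar collapses or the window changes.
const observer = new ResizeObserver((entries) => {
  for (const entry of entries) {
    const { width, height } = entry.contentRect;
    updateTimelineDimensions(width, height);
  }
});
observer.observe(document.getElementById("timeline") as HTMLElement);
```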
Then I added a right-click context menu to clips, allowing for quick renaming and deletion via modals, making clip management much smoother. The timeline ruler is now sticky, and the preview toggle button is more visible in the ruler bar.
Dragging clips is now much better: the timeline auto-scrolls and expands if you drag to the edge, and drop positions are more accurate. Clip drag performance and feedback are also improved, with a ghost clip preview and smoother visuals while dragging.
I completely redesigned the audio control sidebar and level meters. I replaced the old channel rack with a new, collapsible sidebar that includes a full volume mixer with mute controls for master and individual tracks. When collapsed, it shows a simplified view with master controls (volume, level meter) and key project stats like track, clip, and duration counts. The level meters also got a visual overhaul for better readability.
I wanted the entire sidebar to scale dynamically with different browser sizes. I tried and failed to do that, ending up with responsive height for the level meters only. However, I wasn't happy with how it looked, so I reverted that change.
After that, I fixed a bug to ensure the project duration was calculated consistently between the new sidebar and the export options.
I fixed the jittery clip movement to enable smooth cross-track dragging, then tackled the timeline by fixing the dynamic ruler that wasn't scrolling properly and adding smooth playhead auto-scroll. I built a collapsible timeline preview navigator for better project navigation, implemented right-click context menus for track management with rename/delete modals, and fixed the playhead visual bug so it extends across all tracks.
The project started as a collaborative, in-browser DAW. I set up the foundation with React and Tone.js, then immediately built the real-time sync engine for the sequencer and timeline.
After that, I focused on core features, adding recording controls, a playhead indicator, and fixing the clip drag-and-drop logic. As the app grew, I had to overhaul the UI rendering to fix some visual sync issues and make everything feel polished.
Finally, I wrote a proper README to document the project. It’s been a strong start, and with my teammate helping out, we’re now looking to expand the instrument selection.
Inspired by FL Studio, PeerStudio is a collaborative, web-based digital audio workstation for music production, built with modern technologies, love, and some elbow grease, all within the browser.
The first major step was implementing the new pre-roast scanner animation. The primary goal was to transform the initial data-loading phase from a static waiting period into an entertaining and interactive experience. This feature cycles through a selection of the user's top albums and recent tracks, and to power it, a new API endpoint was created to generate a unique, snarky comment for each item as it appears.
After building this core feature, I shifted focus to refining the final screen where the roast is delivered, ensuring the layout was clean and the presentation was impactful.
Finally, to tie everything together, I integrated framer-motion. This was a crucial step to ensure the transitions between the different stages of the intro sequence were seamless. Specifically, it smoothed out the flow between the initial 'Welcome' message, the 'Getting your data...' text, the scanner animation itself, and the final 'Alright, I've seen your data. Ready to face the music?' prompt. This added the necessary layer of visual polish, creating a cohesive and professional-feeling experience without any jarring jumps between states.
I reworked how the AI-powered roast actually talks to users. I wanted the experience to feel much more like a real, snarky conversation, rather than a set of canned responses. This meant a full rewrite of the flow that generates questions and handles user replies. Now, the system uses AI to come up with personalized, witty questions on the fly and stitches together the conversation based on both the user’s data and their answers. The logic for image-based choices, sliders, and multiple choice questions all got smarter, and the responses are shorter, punchier, and much funnier. There’s even a series of custom intro, “no data”, and outro messages, all written to roast or dismiss the user in creative ways.
Finally, I gave the entire UI a makeover to match the new personality of the app. The look is much cleaner and more distinctive, with a premium red, warm off-white, and cool gray palette, and elegant typography using Playfair Display and Montserrat. On top of that, I added a typing animation for the AI’s messages, making the conversation feel more alive and interactive. The result is a much more engaging and visually cohesive experience, one that feels both playful and polished from start to finish.
The project started as just a simple idea. I began by setting up the bare bones of the application: authorization and fetching the data from Last.fm's API.
My first real task was tackling a security issue. I needed to connect to the Last.fm API to get user data, but I had to make sure my API key was kept secret and not exposed in the browser for anyone to see. I quickly reworked how the app gets its data, creating a secure, server-side channel that protected my key while still fetching all the music information I needed.
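The pattern is a simple server-side proxy; a sketch assuming a Next.js-style route handler (the route shape, query parameters, and Last.fm method choice are assumptions, not the app's actual code):

```typescript
// The browser calls this route; only the server ever sees LASTFM_API_KEY.
export async function GET(request: Request) {
  const user = new URL(request.url).searchParams.get("user");

  const upstream = new URL("https://ws.audioscrobbler.com/2.0/");
  upstream.searchParams.set("method", "user.gettoptracks");
  upstream.searchParams.set("user", user ?? "");
  upstream.searchParams.set("api_key", process.env.LASTFM_API_KEY ?? "");
  upstream.searchParams.set("format", "json");

  const res = await fetch(upstream);
  return Response.json(await res.json());
}
```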
Once the connection was secure, I turned my attention to the data itself. The Last.fm API sends a lot of information, much of which I didn't need, like image links, tags, and lists of similar artists. To keep things fast and focused, I started trimming this down, stripping out the unnecessary fields. I then cleaned up the artist bios, which were often messy, to make sure the data for the AI was neat and consistent.
Then came the most challenging and exciting part: bringing the AI to life. My vision was for an interactive experience where the AI wouldn't just spit out a generic roast but would actually engage with the user without having a large database of prepared roasts and jokes, like the judge-my-music on pudding.cool does. Getting the AI to properly ask all the different types of questions was tricky. The last thing I did was write a detailed README file.
A brutally honest AI that judges your music taste based on your Last.fm listening history. This web application connects to your Last.fm account, analyzes your listening patterns, and delivers a personalized, snarky roast of your music preferences through an interactive, conversational AI. It's inspired by Pudding.cool's "Judge My Music" but with extra sass and AI-powered burns, created because the original no longer works reliably.
The biggest issue I tackled was a devastating error calculation bug that was completely ruining error detection. The original algorithm used a character-by-character comparison, which had a catastrophic flaw: after switching views or toggling the smooth cursor setting, typing anything would cause every subsequent character to be misaligned and counted as an error. This also caused the Net WPM to register as zero for the rest of the race. The bug was particularly problematic because it seemed like a UI issue when it was actually a fundamental calculation problem. I completely rewrote this using the Levenshtein distance algorithm, which properly calculates the edit distance and counts only actual mistakes rather than alignment issues. (The attached video demonstrates the first bug)
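The core of that fix is the classic dynamic-programming edit distance; a generic sketch (the project's actual implementation may differ in details):

```typescript
// Standard Levenshtein distance: the minimum number of insertions, deletions,
// and substitutions needed to turn `typed` into `target`. A single slip costs
// one edit instead of flagging every subsequent character as an error.
function levenshtein(typed: string, target: string): number {
  const m = typed.length;
  const n = target.length;
  const dp: number[][] = Array.from({ length: m + 1 }, (_, i) =>
    Array.from({ length: n + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );

  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      const cost = typed[i - 1] === target[j - 1] ? 0 : 1;
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,       // deletion
        dp[i][j - 1] + 1,       // insertion
        dp[i - 1][j - 1] + cost // substitution (or match)
      );
    }
  }
  return dp[m][n];
}
```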
I also solved a frustrating cursor line-wrapping bug where text would flicker and wrap erratically during typing. I implemented a new smooth cursor system, eliminating layout interference entirely. Both cursor modes now use stable positioning logic.
For the AI text generation, I enhanced the prompts to be more specific and to strip unwanted formatting like quotes and introductory phrases. I also added an animated countdown system with scaling animations to make race starts more engaging and professional.
Finally, I created documentation with a detailed README covering features, and technical implementation
I transitioned the scoring system to Net Words Per Minute (Net WPM) to provide a more accurate and fair assessment of typing skill by factoring in errors. Alongside this, I gave the post-race screen a complete overhaul: it now clearly displays Net WPM and total errors, and the old summary screen has been replaced with a much cleaner leaderboard that ranks players by performance. I also improved the personal performance chart. The hardest task I tackled was fixing a desynchronization bug that occurred when toggling the Scrolling View mid-race. My previous implementation would destroy and recreate the UI, causing the user's typed text to be lost visually and marking all subsequent characters as incorrect. To fix this, I refactored the logic to use a single, persistent text input that simply moves between two views. Now, instead of recreating everything, I just toggle the visibility of the views, which preserves the race state perfectly and makes the feature seamless and stable. Finally, I fixed a visual bug causing player cards to misalign. The issue stemmed from a conflict between grid styles in the static HTML and the dynamic JavaScript. I resolved it by removing the layout classes from the HTML, giving the JavaScript full control over the grid, which now aligns the cards correctly.
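For reference, the conventional Net WPM formula looks roughly like this (whether and how the game clamps or rounds is an implementation detail I'm assuming):

```typescript
// Gross WPM is all typed characters divided by 5, per minute; Net WPM
// subtracts one "word" per uncorrected error per minute.
function netWpm(typedChars: number, errors: number, elapsedMs: number): number {
  const minutes = elapsedMs / 60_000;
  if (minutes === 0) return 0;
  const gross = typedChars / 5 / minutes;
  return Math.max(0, gross - errors / minutes);
}
```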
First, I updated the networking to support multi-player lobbies, so you can now race with more than just one friend. This involved creating a more robust system for the host to manage the game state and broadcast updates to everyone in the room.
I also added a fun new way to generate race texts. As the host, you can now write a prompt and use AI to generate a unique paragraph for everyone to type. Of course, you can still paste in your own text if you prefer.
The biggest new feature is the Scrolling View mode. You can enable it in the new settings panel. Instead of everyone having their own text box, all players now appear as cursors on a single, scrolling line of text, making it feel like a true head-to-head race.
Finally, to help you track your progress, I added a post-game performance graph. Once you finish a race, you'll see a chart showing how your WPM and accuracy changed over time.
First, I got the basic typing mechanics working.
Then, I knew it needed multiplayer to be fun, so I added a lobby system. Now you can create a game and have friends join with a code. Seeing everyone's progress in real-time was a key feature I added next.
The initial design was pretty basic, so I gave it a complete makeover with Tailwind CSS for a much cleaner, modern feel. I swapped out all the clunky browser alerts for sleek modals, added a proper game-over screen, and put in a countdown before each race to build anticipation.
After fixing a few bugs and polishing the gameplay, the project has evolved from a simple typing test into a fun, real-time multiplayer race.
A real-time multiplayer typing game built with vanilla JavaScript and WebRTC. Race against friends to see who can type faster with live progress tracking and instant feedback.
Been wrestling with the frequency detection. The original autocorrelation algorithm kept getting fooled by harmonics, giving me wildly inconsistent readings when measuring belt tension. When you pluck a printer belt, it creates a complex signal with overtones, and the algorithm would latch onto the stronger harmonics instead of the actual fundamental frequency I needed.
Added an FFT-based algorithm as an alternative that cuts through the harmonic confusion much better. Its readings jump around a bit more, but it's far more reliable at finding the true fundamental frequency. Now users can switch between both algorithms depending on what works better for their setup.
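A minimal Web Audio sketch of the FFT approach, finding the strongest bin and converting it to Hz (a single-snapshot illustration, not the project's actual detection loop, which would sample repeatedly and restrict the search to a plausible frequency band):

```typescript
// Grab one FFT snapshot from the microphone and return the loudest frequency.
async function detectFrequency(): Promise<number> {
  const ctx = new AudioContext();
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const source = ctx.createMediaStreamSource(stream);

  const analyser = ctx.createAnalyser();
  analyser.fftSize = 8192; // larger FFT = finer frequency resolution
  source.connect(analyser);

  const bins = new Float32Array(analyser.frequencyBinCount);
  analyser.getFloatFrequencyData(bins);

  // Pick the loudest bin and convert its index to Hz.
  let peak = 0;
  for (let i = 1; i < bins.length; i++) {
    if (bins[i] > bins[peak]) peak = i;
  }
  return (peak * ctx.sampleRate) / analyser.fftSize;
}
```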
Just finishing and polishing my Terminal Craft YSWS submission! First I implemented the complete file browsing system with navigation and deletion capabilities, turning it from just a viewer into a proper file manager. Then I fixed an annoying freeze bug that happened when you tried to cycle through a directory with only one STL file. Finally I added filtering to the file explorer so it only shows STL files and directories, cutting out all the useless clutter.
A browser-based 3D printer belt tension meter that uses your device's microphone to measure belt frequency when plucked. Built with JavaScript and Web Audio API after existing tools failed to work on my phone while tensioning my Prusa Core printer. Features real-time frequency analysis, visual tension gauge, and runs directly in any browser with no installation required. Simple, reliable solution when other belt tension apps don't work on your device.
TermiSTL is a cool little Python tool that lets you view 3D STL files as ASCII art right in your terminal! Instead of opening heavy 3D software just to peek at an STL file, you can now see your models rendered as text art with interactive controls - drag to rotate, zoom in/out, and switch between different views. It even shows you basic info about the model like dimensions and volume.
This was widely regarded as a great move by everyone.