Summer of Making has an awful project explorer: it's slow to load, laggy to scroll, and there's no way to search for projects. Unfortunately, SOM does not provide a nice API, so I decided to do a little web scraping to make a blazingly fast search tool that uses page ranking algorithms to find relevant projects.
Noah Jackson
Stayed up till 12am doing more data analysis. Today (technically yesterday), I found every project that is leaking secrets in its .env file. Surprisingly, only about 0.5% of projects leak secrets there. It's mostly database keys, but I found some cool ones like a Gemini key and an AWS key. Tomorrow I will properly report it to staff, because everyone is sleeping rn.
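The devlog doesn't say how the leaked entries were detected, so here is only a rough sketch of one way to flag suspicious lines in the text of a .env file. The name pattern in `SECRET_NAME` and the helper `find_possible_secrets` are assumptions made up for illustration, not the actual analysis code. It returns variable names only, so the secret values never get printed.

```python
import re

# Assumed heuristic: variable names containing these words usually hold credentials.
SECRET_NAME = re.compile(r"(KEY|SECRET|TOKEN|PASSWORD|DATABASE_URL)", re.IGNORECASE)


def find_possible_secrets(env_text):
    """Return the names of .env entries that look like they hold a credential."""
    hits = []
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not NAME=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        name = name.strip().removeprefix("export ").strip()
        value = value.strip().strip("'\"")
        # An empty value is a template entry, not a leak.
        if value and SECRET_NAME.search(name):
            hits.append(name)
    return hits


# Made-up sample file contents, only to show the call.
sample_env = """
# local settings
PORT=3000
API_KEY=abc123
DATABASE_URL=
export AUTH_TOKEN="xyz"
"""
print(find_possible_secrets(sample_env))
```

A heuristic like this will still flag placeholder values such as `your_key_here`, so any hit would need a manual look before reporting it.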
Today I did some analysis on the current dataset of projects and found the most popular websites that people put as their repo link. To nobody's surprise, GitHub is by far the most popular, then Codeberg, then GitLab, and finally a bunch of self-hosters.
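A count like that can be sketched in a few lines of Python. This assumes the repo links are available as a plain list of URL strings; `count_repo_hosts` is a hypothetical helper, not the code used for the analysis.

```python
from collections import Counter
from urllib.parse import urlparse


def count_repo_hosts(repo_links):
    """Count how often each hostname appears in a list of repo URLs."""
    hosts = Counter()
    for link in repo_links:
        host = urlparse(link).hostname  # None when the string has no scheme/host
        if host is None:
            continue
        # Treat "www.github.com" and "github.com" as the same site.
        hosts[host.removeprefix("www.")] += 1
    return hosts


# Made-up sample links, only to show the call.
sample_links = [
    "https://github.com/a/b",
    "https://www.github.com/c/d",
    "https://codeberg.org/e/f",
    "not a url",
]
print(count_repo_hosts(sample_links).most_common())
```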
Spent a while tuning the core algorithms. It's still really simple right now because I don't have all the data. Next week I will do another scrape to go from 6k to 9k projects and get a lot more devlogs. Ranking right now is hard because there isn't enough data yet to do better algorithms. I will try my best to make it hard for people to game the future algorithms and rank high even if the project is mid.
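The actual ranking code isn't shown in these devlogs. As an illustration of what a "really simple" ranker can look like, here is a sketch that counts query-term hits and weights title hits more heavily. The field names `title` and `description` and the weight of 3.0 are assumptions.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(project, query, title_weight=3.0):
    """Count query-term hits, weighting title hits more than description hits."""
    terms = set(tokenize(query))
    title_hits = sum(1 for t in tokenize(project["title"]) if t in terms)
    body_hits = sum(1 for t in tokenize(project["description"]) if t in terms)
    return title_weight * title_hits + body_hits


def search(projects, query):
    """Return projects with at least one hit, best score first."""
    scored = [(score(p, query), p) for p in projects]
    matches = [sp for sp in scored if sp[0] > 0]
    matches.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in matches]


# Made-up projects, only to show the call.
demo_projects = [
    {"title": "Notes", "description": "a chess notebook"},
    {"title": "Chess bot", "description": "plays chess online"},
    {"title": "Todo", "description": "a list app"},
]
print([p["title"] for p in search(demo_projects, "chess")])
```

Raw term counting like this is easy to game by stuffing keywords into a description, which is exactly the kind of thing a later, better-informed algorithm would need to guard against.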
Now links to searches can be shared! Before, it was hard to send someone the results of a search; now the query is embedded in the URL.
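Embedding a query in a URL usually means putting it in the query string. A minimal sketch of the round trip, assuming a parameter named `q` (the site's real parameter name isn't stated in the devlog):

```python
from urllib.parse import parse_qs, urlencode, urlparse

BASE_URL = "https://searxing.hackclub.app/"  # site URL taken from the devlog


def build_share_url(query, param="q"):
    """Build a shareable link; "q" is an assumed parameter name."""
    return BASE_URL + "?" + urlencode({param: query})


def read_query(url, param="q"):
    """Recover the search text from a shared link, or "" if it is absent."""
    values = parse_qs(urlparse(url).query).get(param, [])
    return values[0] if values else ""


link = build_share_url("rust game engine")
print(link)
print(read_query(link))
```

`urlencode` takes care of escaping spaces and special characters, so the link survives being pasted into chat.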
Went away for a robotics competition and was expecting to see the website down. I guess the Nest team managed to scale up the servers and fix a lot of the issues. Huge props to them!
Wow! Not even a day later and Nest is broken again. Eventually I need to switch to a proper host for this because this is getting annoying. A $5/month plan should be more than sufficient; heck, maybe I'll try one of those serverless things.
Got it hosted! Try it out at https://searxing.hackclub.app/. It's fairly fast and contains over 6k projects. There are still some improvements I can make to the website, but it's nearly complete. Searching is pretty good, and feel free to use the results as inspiration for your own projects.
Doing just a little bit of scraping. Also, I made a custom library to help me with parsing the HTML schema, and after a bit of debugging, it is surprisingly good. Spent some time preventing other assets from loading to not overwhelm Hack Club's servers. They might get mad if I used up a few gigabytes of bandwidth. It took only like 1h 30m to scrape all projects, and it fits in a very small JSON file. Now I need to learn how to write a good search engine and host the website. Yay progress!
Spent way too long deciding on what type of database to use, so I ended up trying a lot of them. Instead of doing the correct thing and watching a YouTube video, reading the docs, or even reading a blog post, I decided to blindly try ones that I have heard of, even if they didn't match my use case. In the end, I ended up with terrible implementations of all of them, like an SQL DB that copies everything to Python for ranking, a Redis DB that is somehow relational, etc., etc. Anyways, instead of actually reading about which to use, I will write my own in Rust that will exactly match my use case. Remember, coding for 6 hours can save you 30 minutes of reading documentation.
Very clean web scraper API. I might also use this to yoink some projects from other Hack Club-run sites. Figuring out how to get devlogs took some work, but all good now.
Searching mostly works, and the scraper got all the projects into one large JSON file. I need to figure out how to implement some kind of ranking algorithm and what to do about the JSON size being 90% base64 images.
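One option for the image bloat is to drop the inline images before building the search data. The sketch below assumes the images are stored as `data:image/...;base64,` strings somewhere inside the parsed JSON; the devlog doesn't show the actual layout, so the field names in the sample are made up.

```python
import json


def strip_data_uris(value, placeholder=None):
    """Return a copy of a parsed JSON value with inline image data URIs replaced."""
    if isinstance(value, dict):
        return {k: strip_data_uris(v, placeholder) for k, v in value.items()}
    if isinstance(value, list):
        return [strip_data_uris(v, placeholder) for v in value]
    if isinstance(value, str) and value.startswith("data:image/"):
        return placeholder
    return value


# Made-up project record with a fake 1000-character image payload.
projects = [{"title": "demo", "banner": "data:image/png;base64," + "A" * 1000}]
slim = strip_data_uris(projects)
print(len(json.dumps(projects)), "->", len(json.dumps(slim)))
```

The originals can stay in a separate file, so the search index only carries text.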
I'm going to use AI only on the frontend because I suck at webdev and I want this tool to actually be useful. Projects are manually copied from journey.hackclub.com, and I was heavily inspired by their style. Currently nothing is interactive, but give me a week and I can make this something amazing.