Retro search engine

14 devlogs
8h 16m
•  Ship certified
Created by harry rogers

A pre-Google-style search engine.

Timeline

Ship 1

harry rogers

11 days ago

Covers 14 devlogs

Code cleanup, plus a README, all pushed to GitHub.

Time to clean up the mess that is my project structure.

Fixed an error: when the user is on a VPN, MaxMind can't get the city name, which caused an error in the weather widget, so it now just shows the country name. Also, the news feeds work!
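
The city-to-country fallback described above could look roughly like this. It is only a sketch: `location_label` and the `record` dictionary shape are hypothetical stand-ins for whatever the real MaxMind lookup returns in this project.

```python
def location_label(record):
    """Pick a display label for the weather widget.

    `record` is a hypothetical dict with optional "city" and "country" keys,
    standing in for the result of a GeoLite2-City lookup.
    """
    city = (record.get("city") or "").strip()
    country = (record.get("country") or "").strip()
    if city and country:
        return f"{city}, {country}"
    # Behind a VPN the city is often missing, so fall back to the country.
    return city or country or "Unknown"


with_city = location_label({"city": "London", "country": "United Kingdom"})
vpn_lookup = location_label({"city": None, "country": "United Kingdom"})
nothing_found = location_label({})
```

The point is that a missing city never raises; the label simply gets less specific.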

Mohit Tiwari 12 days ago
Damn, this looks super nostalgic!

https://github.com/yavuz/news-feed-list-of-countries/ is a good Git repo, and I have now added news sources for each country.

Fun story for this one. I created a boosting algorithm that boosts a result when its domain matches the query, so if you searched "facebook" you would get facebook.com. It wasn't working, as you can see in past posts, because I had a bad index desynchronization bug. There is a BM25 file that lists all the BM25 indexes, and a line index file that stores the index of every line in the file, but it was off. Why was it off? Line endings on Windows are two characters (\r\n), and I had counted one (\n). So it would have worked on Linux or Mac, but on Windows, nooo. Okay, now it works :)
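
To show the kind of fix that avoids this bug, here is a minimal sketch of a line offset index built in binary mode, so that \r\n counts as two bytes and \n as one, whatever the platform. The function names and the demo file are made up for illustration; this is not the project's actual index code.

```python
import os
import tempfile


def build_line_offsets(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    position = 0
    # Binary mode: no newline translation, so len(raw) is the true on-disk size.
    with open(path, "rb") as handle:
        for raw in handle:
            offsets.append(position)
            position += len(raw)
    return offsets


def read_line(path, offsets, line_number):
    """Seek straight to one line and return it without its line ending."""
    with open(path, "rb") as handle:
        handle.seek(offsets[line_number])
        return handle.readline().rstrip(b"\r\n").decode("utf-8")


# Demo file with Windows-style line endings.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"facebook.com\r\nwikipedia.org\r\ngoogle.com\r\n")
    demo_path = tmp.name

demo_offsets = build_line_offsets(demo_path)
third_line = read_line(demo_path, demo_offsets, 2)
os.remove(demo_path)
```

Because the offsets are measured from the raw bytes instead of assuming one character per line ending, the same index stays in sync on Windows, Linux, and Mac.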

I did something. I put in the BM25 formula: Score = sum over terms of [inverse document frequency × (term frequency × (saturation + 1)) ÷ (term frequency + saturation × (1 − normalization + normalization × (document length ÷ average document length)))]. Honestly, it's so confusing. It also has some more issues now, so I'll have to work on it.
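
The formula above may be easier to follow as code. This is a sketch only, not the project's scoring code: "saturation" is the parameter usually called k1 and "normalization" is the one usually called b, the defaults k1=1.5 and b=0.75 are common choices assumed here, and the smoothed IDF variant is also an assumption, since the post does not say which one is used.

```python
import math


def bm25_score(query_terms, doc_terms, doc_freq, total_docs, avg_doc_len,
               k1=1.5, b=0.75):
    """Score one document (a list of terms) against a query with BM25."""
    score = 0.0
    doc_len = len(doc_terms)
    for term in query_terms:
        tf = doc_terms.count(term)  # term frequency in this document
        if tf == 0:
            continue
        n = doc_freq.get(term, 0)  # number of documents containing the term
        # Assumed IDF variant: smoothed so it never goes negative.
        idf = math.log((total_docs - n + 0.5) / (n + 0.5) + 1.0)
        # Length normalization: long documents are penalized when b > 0.
        norm = 1.0 - b + b * (doc_len / avg_doc_len)
        score += idf * (tf * (k1 + 1.0)) / (tf + k1 * norm)
    return score


# Tiny made-up corpus, already tokenized.
documents = [
    ["facebook", "login", "page"],
    ["wikipedia", "free", "encyclopedia", "facebook", "article"],
    ["google", "search"],
]
total_docs = len(documents)
avg_doc_len = sum(len(doc) for doc in documents) / total_docs
doc_freq = {}
for doc in documents:
    for term in set(doc):
        doc_freq[term] = doc_freq.get(term, 0) + 1

scores = [
    bm25_score(["facebook"], doc, doc_freq, total_docs, avg_doc_len)
    for doc in documents
]
```

In this toy corpus the first two documents each mention "facebook" once, but the shorter one scores higher because of the length normalization, and the third scores zero.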

IT WORKS. Okay, it has some issues. The algorithm needs to be better: for example, Wikipedia ranks higher up when you search "google". Another issue is that the line data is larger than 1 MB, and I'm having trouble loading it because it's so big. But I did it. I'll keep you updated as I go. Enjoy a little demo I made.

The data can be empty, ugh. That's why I was trying to parse meta tags and such, but they're empty.

A fun little GIF to show how amazingly it's going.

Okay, so I got searching done. It's very basic, and I hate how slow it is. Most search engines use an inverted index, so let's do that, but it means going through all the code again. Woop woop, time to script. First I want to remove any failed scrapes. Expect a lot more devlogs.
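
For anyone wondering what an inverted index is: instead of scanning every document for every query, you build a map from each term to the set of documents that contain it once, and a search becomes a few lookups. A minimal sketch, with a made-up whitespace tokenizer and made-up documents:

```python
from collections import defaultdict


def tokenize(text):
    """Very simple tokenizer: lowercase and split on whitespace."""
    return [word for word in text.lower().split() if word]


def build_inverted_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


demo_documents = {
    1: "Facebook social network",
    2: "Wikipedia the free encyclopedia",
    3: "Free social news",
}
demo_index = build_inverted_index(demo_documents)
free_hits = search(demo_index, "free")
free_social_hits = search(demo_index, "free social")
missing_hits = search(demo_index, "missing")
```

The speedup comes from only touching the posting sets for the query terms, never the full list of documents.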

Haven't shown any code in a while, so here's the latest and greatest Python scraper. It moves fast: about 60 sites a second.

Scraping was taking way too long, so I have made the script way faster and better.
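
One common way to speed up a scraper like this is to fetch many sites at once with a thread pool, since most of the time is spent waiting on the network. The sketch below is an assumption about the general approach, not the actual script: `scrape_all` is a hypothetical helper, and `fake_fetch` stands in for a real HTTP request so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def scrape_all(urls, fetch, max_workers=32):
    """Run fetch(url) for every URL in parallel; collect results and failures."""
    results = {}
    failures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_url = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as error:
                # Keep failed sites separate so they can be dropped later.
                failures[url] = str(error)
    return results, failures


def fake_fetch(url):
    """Stand-in for a real HTTP request (hypothetical, no network needed)."""
    if "broken" in url:
        raise ValueError("could not fetch")
    return f"<title>{url}</title>"


demo_results, demo_failures = scrape_all(
    ["a.example", "broken.example", "b.example"], fake_fetch, max_workers=4
)
```

Tracking failures in their own dictionary also makes it easy to filter out failed scrapes before indexing.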

The scraping is taking a while. I've been updating the scraper, and while it does its thing I worked on the UI. The news comes in through RSS feeds, and the market data through an npm package that pulls from Yahoo Finance. I've been trying not to rely on APIs and their rate limits, so to get the location for the news I'm using MaxMind's GeoLite2-City.mmdb, which gives the user's country and location for weather and news.

Step one of building a search engine is getting data. I have a list of the top 10 million sites, and I built a simple Python bot to scrape the metadata for each site.
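
As a rough idea of what scraping metadata means here, this sketch pulls the title and meta description out of a page using only the standard library. The class and function names are made up, and the real bot may collect different fields; note that a page with no meta tags simply yields empty strings.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the <title> text and <meta name/property ... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = attributes.get("name") or attributes.get("property")
            content = attributes.get("content")
            if key and content:
                self.meta[key.lower()] = content.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()


def extract_metadata(html):
    """Return the title and description, or empty strings when missing."""
    parser = MetaExtractor()
    parser.feed(html)
    return {
        "title": parser.title,
        "description": parser.meta.get("description", ""),
    }


demo_page = (
    "<html><head><title>Example Site</title>"
    '<meta name="description" content="A demo page.">'
    "</head><body></body></html>"
)
demo_metadata = extract_metadata(demo_page)
empty_metadata = extract_metadata("<html><head></head></html>")
```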
