Retro search engine

harry rogers

1h 26m • 6 months ago

Doing some new work on this. I've started adding an "add any site" feature so people can add sites they like. I'm also working on boosting English text and sites with descriptions, so you don't get a bunch of fluff websites.

Ship 1

1 payout of 165.0 shells

7 months ago

harry rogers • Covers 14 devlogs and 7h 27m

harry rogers

59m • 7 months ago

Code cleanup, plus a README, and pushed to GitHub.

harry rogers

32m • 7 months ago

Time to clean up the mess of my project structure.

harry rogers

12m • 7 months ago

Fixed an error when using a VPN: MaxMind can't get the city name, which caused an error for the weather, so it now just shows the country name. Also, news feeds work!
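
A minimal sketch of that fallback, assuming the lookup hands back a city name and a country name that may each be missing. The function name `location_label` and the "Unknown location" default are hypothetical, not taken from the project.

```python
def location_label(city_name, country_name):
    """Prefer the city; fall back to the country when the city is missing."""
    if city_name:
        return city_name
    if country_name:
        return country_name
    # Hypothetical last resort when the lookup returns nothing usable.
    return "Unknown location"
```

Behind a VPN the city is often empty, so `location_label(None, "Germany")` gives the country name instead of raising an error further down in the weather code.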

Mohit Tiwari 7 months ago

damn this looks super nostalgic!

harry rogers

31m • 7 months ago

https://github.com/yavuz/news-feed-list-of-countries/ is a good Git repo, and I have now added news sources for each country.

harry rogers

28m • 7 months ago

Fun story for this one. I created a boosting algorithm that boosts a result when the query matches its domain, so if you searched "facebook" you would get facebook.com. Well, it wasn't working, as you can see in past posts, because I had a bad index desynchronization bug. There is a BM25 file that lists all the BM25 indexes, and a line index file that stores all the line indexes for the data file, but it was off. Why was it off? The line endings in the file are two characters (\r\n) when I had assumed one (\n), so it would have worked on Linux or Mac, but on Windows, nope. So okay, now it works :)
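
A small sketch of one way to avoid this kind of off-by-one drift: build the line index from the real bytes on disk instead of assuming a line-ending length. The helper names and the sample file are hypothetical. The key assumption is that the file is opened in binary mode, so "\r\n" counts as two bytes and "\n" as one.

```python
import os
import tempfile


def build_line_offsets(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    position = 0
    with open(path, "rb") as handle:
        for raw_line in handle:
            offsets.append(position)
            # len() on raw bytes includes the real line ending, 1 or 2 bytes.
            position += len(raw_line)
    return offsets


def read_line(path, offsets, line_number):
    """Seek straight to one line using the precomputed offsets."""
    with open(path, "rb") as handle:
        handle.seek(offsets[line_number])
        return handle.readline().decode("utf-8").rstrip("\r\n")


def demo():
    # Hypothetical data file written with Windows line endings.
    with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
        tmp.write(b"facebook.com\r\ngoogle.com\r\nwikipedia.org\r\n")
        path = tmp.name
    try:
        offsets = build_line_offsets(path)
        return offsets, read_line(path, offsets, 2)
    finally:
        os.remove(path)
```

For the sample file the offsets come out as 0, 14 and 26. An index that assumed a single "\n" would have stored 13 and 24, which lands in the middle of the wrong line.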

harry rogers

7m • 7 months ago

I did something. Okay, so I put in the BM25 formula: Score = sum over terms of [inverse document frequency × (term frequency × (saturation + 1)) ÷ (term frequency + saturation × (1 − normalization + normalization × (document length ÷ average document length)))]. Honestly, it's so confusing. And now it has some more issues, so I'll have to work on it.
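
Written as code, the formula is a bit easier to follow. This is a sketch, not the project's implementation: `k1` stands for the saturation parameter and `b` for the normalization parameter, the defaults 1.5 and 0.75 are just commonly used values, and the IDF shown is one common variant among several.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Score every tokenized document against the query with BM25."""
    if not documents:
        return []
    doc_count = len(documents)
    avg_len = sum(len(doc) for doc in documents) / doc_count

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # One common IDF variant; the "+ 1" keeps it non-negative.
            idf = math.log((doc_count - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Length normalization: long documents are penalized a little.
            norm = 1 - b + b * (len(doc) / avg_len)
            score += idf * (tf * (k1 + 1)) / (tf + k1 * norm)
        scores.append(score)
    return scores
```

Calling `bm25_scores(["facebook"], docs)` with a list of token lists returns one score per document, with 0.0 for documents that never mention the term.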

harry rogers

9m • 7 months ago

IT WORKS. Okay, okay, it has some issues: the algorithm needs to be better, for example Wikipedia shows up higher when you search "google". Another issue is that the line data is larger than 1 MB, and I'm having issues loading it because it's so big. But I did it. I will keep you updated as I go, but enjoy a little demo I made.

harry rogers

5m • 7 months ago

The data can be empty, ugh. That's why I was trying to parse meta tags and such, but they're empty.

harry rogers

18m • 7 months ago

Fun little GIF to show how amazingly it's going.

harry rogers

22m • 7 months ago

Okay, so I got searching done. Well, it's very basic, and I hate how slow it is. Most search engines use an inverted index, so let's do that, but it means going through all the code again. Woop woop, time to script. First I want to remove any failed scrapes. Expect a lot more devlogs.
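
For anyone wondering what an inverted index is, here is a minimal sketch. It maps each term to the set of documents containing it, so a query only touches the matching entries instead of scanning every site. The function names and the whitespace tokenizer are simplifying assumptions.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result
```

The index is built once up front. After that, each search is a handful of dictionary lookups and set intersections.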

harry rogers

17m • 7 months ago

Haven't shown any code in a while, so here's the latest and greatest Python scraper. It moves fast: about 60 sites a second.

harry rogers

8m • 7 months ago

Scraping was taking way too long, so I have made the script way faster and better.

harry rogers

2h 21m • 7 months ago

The scraping is taking a while. I've been updating and working on the scraper, and while it does its thing I worked on the UI. The news comes through RSS feeds, and the market data through an npm package that gets Yahoo Finance data. I've been working on not relying on APIs and rate limits, so to get the location for news I'm using MaxMind GeoLite2-City.mmdb to get the user's country and location for weather and news.
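
Pulling headlines out of an RSS feed needs no API key, which is the appeal. A rough sketch using only the Python standard library, assuming a plain RSS 2.0 document that has already been downloaded as text. The function name and the `limit` parameter are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text, limit=5):
    """Return up to `limit` headlines from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns None for a missing tag, so default to "".
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
        if len(items) >= limit:
            break
    return items
```

Fetching the feed is left out on purpose. Any HTTP client can supply `xml_text`, and feeds that use Atom or XML namespaces would need extra handling.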

harry rogers

53m • 7 months ago

Step one of building a search engine is getting data. I have got a list of the top 10 million sites, and I built a simple Python bot to scrape the metadata for each site.
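
The metadata step can be sketched with the standard library alone. This is an illustration, not the actual bot: it assumes the page HTML has already been fetched, and the class and function names are made up for the example. It pulls the `<title>` text and the meta description, and returns empty strings when a page has neither.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            # Attribute values can be None, so fall back to "".
            name = (attributes.get("name") or "").lower()
            if name == "description":
                self.description = (attributes.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_metadata(html):
    """Return the title and description found in an HTML string."""
    parser = MetaExtractor()
    parser.feed(html)
    return {"title": parser.title.strip(), "description": parser.description}
```

Plenty of sites ship no description at all, so whatever consumes this output has to cope with empty fields.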
