I’ve spent the last 3 months building a crawler to index the public parts of Telegram (<a href="https://telehunt.org" rel="nofollow">https://telehunt.org</a>). The native search is essentially a black box that favors the top 0.1% of bots, leaving the rest almost invisible.
The Tech: I had to deal with rate limits and the lack of a global 'sitemap'. I’m currently using a hybrid approach of metadata scraping to keep the index fresh.
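To make the rate-limit part concrete, here is a minimal sketch of retrying with exponential backoff. It is not the crawler's actual code: `RateLimited`, `fetch_with_backoff`, and the `fetch` callable are hypothetical names used only for illustration, and nothing here calls a real Telegram API.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error a fetcher raises when the server asks us to slow down."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def fetch_with_backoff(fetch, key, max_attempts=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call fetch(key), waiting and retrying whenever it raises RateLimited."""
    for attempt in range(max_attempts):
        try:
            return fetch(key)
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller decide what to do
            if err.retry_after is not None:
                # Assumption: the server's hint is trusted and used as-is.
                delay = err.retry_after
            else:
                # Exponential backoff (1s, 2s, 4s, ...) capped at max_delay,
                # plus random jitter so parallel workers do not retry in lockstep.
                delay = min(max_delay, base_delay * 2 ** attempt)
                delay += random.uniform(0, delay / 2)
            sleep(delay)
```

The `sleep` parameter is injectable so the retry logic can be tested without actually waiting. A real crawler would probably also want a per-host request budget on top of this.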
The Goal: It’s an experiment in making 'un-indexable' bot data discoverable.
Where is the search engine? The site says that it's a bot directory.
Wikipedia: "A search engine is a software system that provides hyperlinks to web pages, and other relevant information on the Web in response to a user's query".<p>I think there can be different expectations connected to this term. It seems to be a "search engine" for bots. A bot directory does not have to have "search" functionality, right?
You may be overestimating the number of bots that meaningfully exist. The vast majority of bots (and public channels) on the platform are nonfunctional and/or spam.
It's all about Bot directories... (((
This is cool. Telegram also has a Premium feature which crawls the contents of (presumably) all public channels on the platform. It's limited to 10 searches per day and doesn't search for old content if there are too many retrieved posts.
- "I built a search engine" sounds cool on Hacker News, but in reality it is a "company product", right?<p>- Do the links in the footer work? I tried clicking on the GitHub icon, and it appears to be broken.
What do you verify about the bots?