Love all of these RSS resources. Thanks for sharing!
Last week, I spent a couple of hours at a local hack event putting an RSS aggregator[1] together for our community. Just something fun to do.
One thing I realized when I deployed is that Substack returns a 403 if you try to read their RSS feeds from a GitHub Action. The only obvious workaround to me is to pull the content locally every so often, commit it, and then deploy (roughly as sketched below). But I'd much rather have this site updating itself via a GitHub Action on a cron schedule.
Have you run into this situation before?
[1]: https://github.com/astoria-tech/subcurrent-astro/
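For what it's worth, the local-pull step I mean looks roughly like this. It's only a minimal sketch: the feed URLs, output directory, and filenames are placeholders, not the actual subcurrent-astro setup.

```python
# fetch_feeds.py - run locally, then commit the saved XML and deploy.
# Sketch only: feed URLs and output paths below are placeholders.
import pathlib
import urllib.request

FEEDS = {
    # "slug": "feed URL" -- illustrative entries, not the real list
    "example-substack": "https://example.substack.com/feed",
    "example-blog": "https://example.com/rss.xml",
}

OUT_DIR = pathlib.Path("src/content/feeds")


def fetch_all() -> None:
    OUT_DIR.mkdir(parents=True, exist_ok=True)
    for slug, url in FEEDS.items():
        # Browser-like User-Agent; some hosts reject default script UAs.
        req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urllib.request.urlopen(req, timeout=30) as resp:
            (OUT_DIR / f"{slug}.xml").write_bytes(resp.read())


if __name__ == "__main__":
    fetch_all()
```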
Every small town deserves someone like this. And also someone moderating local Facebook groups to tamp down the scams. For every elderly victim of scams, there’s an honest, hard-working young person who discovers that the corresponding economic opportunity moved elsewhere.
Isn't it time-consuming to build a scraper for every website you want to get updates from? What if the HTML is a mess and full of auto-generated front-end-framework classes, etc.?
Once I’d created the structure to support lots of different kinds of scraping-to-feed conversions, it’s usually fast to add a new target site into the mix. There are definitely exceptions, and the occasional bit of maintenance when someone updates their CSS.
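The shape of it is roughly a per-site config plus one generic scrape step. Here’s a minimal sketch, not my actual code; the site names, URLs, and CSS selectors are all made up:

```python
# Sketch of a per-site scraping-to-feed structure.
# Everything here (site names, URLs, selectors) is illustrative only.
import requests
from bs4 import BeautifulSoup

# Each target site is a small config entry: where to fetch,
# plus the CSS selectors that map onto feed fields.
SITES = {
    "town-news": {
        "url": "https://example.org/news",
        "item": "article.post",
        "title": "h2 a",
        "link": "h2 a",
    },
    "library-events": {
        "url": "https://example.org/events",
        "item": "li.event",
        "title": ".event-title",
        "link": "a.details",
    },
}


def scrape(site: dict) -> list[dict]:
    """Turn one site's HTML into a list of feed-entry dicts."""
    html = requests.get(site["url"], timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    entries = []
    for item in soup.select(site["item"]):
        title_el = item.select_one(site["title"])
        link_el = item.select_one(site["link"])
        if not title_el or not link_el:
            continue  # skip items the selectors miss
        entries.append({
            "title": title_el.get_text(strip=True),
            "link": link_el.get("href", ""),
        })
    return entries


if __name__ == "__main__":
    for name, cfg in SITES.items():
        print(name, len(scrape(cfg)), "entries")
```

Adding a new site is then mostly one more config entry; the maintenance hit comes when a redesign changes the selectors.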
For apps, static analysis and reverse engineering can be a good alternative (or complement) to the proxy-in-the-middle technique.