Lemmy newb here, not sure if this is right for this /c.

I found an article by someone who hosts their own website and micro-social network. It covers their experience with web-scraping bots that refuse to respect robots.txt, and how they deal with them.

  • F04118F@feddit.nl
    9 months ago

    Interesting approach, but it looks like this ultimately ends up:

    • being a lot of babysitting / manual work
    • blocking a lot of humans
    • not being robust against scrapers

    Anubis seems like a much better option, for those wanting to block bots without relying on Cloudflare:

    https://anubis.techaro.lol/

    • irotsoma@lemmy.blahaj.zone
      9 months ago

      Are there any guides to using it with reverse proxies like Traefik? I’ve been wanting to try it out but haven’t had time to do the research yet.