Are websites embassies of foreign soil inside your own hardware?

This one is a bit roundabout, but stick with me. I've exchanged a few emails with the science.org technical support people about the RSS feed for the excellent chemistry/pharmacology blog "In the Pipeline". It's not Derek's fault at all, but his science.org/AAAS hosts have basically blocked native RSS feed readers; they only allow corporate web services that do the feed-reader part for you, like Feedly. They consider a native feed reader application to be scraping their website and ban it.

Hello $superkuhrealname,

I wanted to follow up on your inquiry regarding RSS readers being blocked on science.org. We allow most traditional RSS readers (like Feedly) but this one in particular (QuiteRSS) we do not support. It behaves differently than most readers by using a browser to scrape content similar to a bot. We encourage you to try another RSS feed reader.

Let me know if you have any questions. Thank you.

Jessica Redacted
Publishing Platform Manager
American Association for the Advancement of Science
1200 New York Ave NW, Washington, DC 20005
jredacted@aaas.org

All QuiteRSS does is literally an HTTP HEAD or GET for the feed URL.

10.13.37.1 - - [06/Sep/2023:15:45:53 -0500] "HEAD /blog/rss.xml HTTP/1.1" 200 0 "-" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.21 (KHTML, like Gecko) QuiteRSS/0.18.12 Safari/537.21" 
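
To make that concrete, here is a minimal sketch in Python of what a feed reader's poll amounts to. It is not QuiteRSS's actual code. The feed URL is a placeholder and the function names are made up for illustration; the User-Agent string is copied from the log line above.

```python
import urllib.request

# Placeholder feed URL (an assumption for illustration, not the real feed).
FEED_URL = "https://example.org/blog/rss.xml"

# User-Agent string copied from the access log line above.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.21 "
    "(KHTML, like Gecko) QuiteRSS/0.18.12 Safari/537.21"
)


def build_feed_request(url, method="HEAD"):
    """Build a plain HTTP request for the feed URL (hypothetical helper)."""
    return urllib.request.Request(
        url, method=method, headers={"User-Agent": USER_AGENT}
    )


def fetch_feed(url):
    """HEAD the feed to see if it is reachable, then GET the XML body.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, which is
    what a blocked reader would run into.
    """
    with urllib.request.urlopen(build_feed_request(url, "HEAD"), timeout=10):
        pass  # Headers only; no body is transferred for HEAD.
    with urllib.request.urlopen(build_feed_request(url, "GET"), timeout=10) as resp:
        return resp.read()  # Raw feed XML as bytes.
```

That is the whole exchange: one or two requests for a single XML file, with no page rendering, no JavaScript, and no crawling of links.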

It is the most normal of normal RSS readers. So I'm a bit taken aback at how a professional organization can hold such obviously ignorant and dangerous views about what an RSS feed is. I brought it up on a cyberpunk IRC channel, and it was pointed out that this reflects a more fundamental division in how computing is perceived these days.

this whole "scraper" equals the boogieman to people now. You're presenting data to an external client, what said client does with the data is none of your business.

You have people who saw the internet before it was commercial, or who came later but know how the meat is made, and they perceive it that way. Then you have commercial/institutional/government types, and people who were presented the web as a fait accompli, who see it as a black box where interference is against the law; "interference" being a POV word choice. I don't think changing a CSS rule is interference, but nowadays it'd be treated like vandalizing someone's building wall.

It's as if visiting a website and downloading the publicly available contents is a nation setting up an embassy of "foreign soil" on your hardware.

Their cultural expectation is that you cannot do what you want with that data. Modifying it, or how it's displayed, is to them like walking into their business location and moving around the displays. So obviously the only legal interface is the one they provide "at their location" or via another incorporated entity they associate with. But of course they aren't at *their location*; they're at my location, on my property, in my PC. Yet slowly this commercial norm is working its way into legislation to become our reality as web attestation.

What they see, and what they want, is a situation equal to you going to their business premises and sitting down at one of their machines. They want to own your computer in just the same way, simply because you visited a website. That shit's fucked.

Digging deeper into the situation, I noticed the real problem: it's Cloudflare. Of course. They're applying the Cloudflare policies to the entire domain as a whole, and the invasive browser-internals checks it runs for bots are blocking everything other than major browsers and other corporations, like Feedly, that they add to whitelists. It was silly of me to expect their support email address to connect me with a person who wouldn't ignorantly lie to me. The problem isn't DNS anymore. It's always Cloudflare.
