This unicode fuckery isn't just for domain names: I've had assholes list my books for sale on Amazon and Apple Books under an account that uses a unicode glyph to replace one of the letters in my name—the title shows up in search, but any payments for sales go into the grifter's account.
chaos.social/@Emathion/1146132…
Thomas Sturm
in reply to Charlie Stross
Charlie Stross
in reply to Thomas Sturm
Thomas Sturm
in reply to Charlie Stross
Well, not banning, just running any vague matches of author+title involving unusual unicode ranges through a manual queue during title setup.
Amazon is a big company, pretty sure they could handle this pretty smoothly IF they wanted to.
Elena ``of Valhalla''
in reply to Charlie Stross
@Charlie Stross @Thomas Sturm OTOH they could auto-ban names that use a suspicious mix of specific characters from different unicode blocks, as defined by the unicode consortium itself
unicode.org/reports/tr39/
(there are libraries that do all of the dirty work for you)
if they allowed for a manual override (after reasonable checks) for that one author who really wants to sell a book titled “don't go to aⅿazon.com” I'd think it would be a pretty reasonable restriction
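A rough sketch of the kind of check being discussed, in Python with only the standard library. This is not an implementation of TR39: it guesses a character's script from the first word of its Unicode name, which is a crude heuristic, and the function names `rough_scripts` and `needs_manual_review` are made up for illustration. A real system would more likely use a library built on the Unicode confusables and script data, such as ICU's spoof checker.

```python
import unicodedata

def rough_scripts(text: str) -> set[str]:
    """Guess the script of each letter from the first word of its Unicode name.

    Heuristic only: real TR39 checks use the Script_Extensions property.
    """
    scripts = set()
    for ch in unicodedata.normalize("NFC", text):
        if not ch.isalpha():
            continue  # skip digits, spaces, punctuation, combining marks
        name = unicodedata.name(ch, "UNKNOWN")
        scripts.add(name.split(" ")[0])  # e.g. "LATIN", "CYRILLIC", "GREEK"
    return scripts

def needs_manual_review(text: str) -> bool:
    """Flag names/titles that deserve a human look (hypothetical policy)."""
    nfc = unicodedata.normalize("NFC", text)
    nfkc = unicodedata.normalize("NFKC", text)
    # Compatibility characters (e.g. U+217F, the Roman numeral that looks
    # like "m") fold to ordinary letters under NFKC but not under NFC.
    if nfc != nfkc:
        return True
    # Letters drawn from more than one script, e.g. Latin plus Cyrillic.
    return len(rough_scripts(text)) > 1

print(needs_manual_review("Charlie Stross"))    # plain Latin
print(needs_manual_review("a\u217fazon.com"))   # Roman numeral U+217F in place of "m"
print(needs_manual_review("Str\u043ess"))       # Cyrillic U+043E in place of "o"
```

Sending hits to a review queue instead of rejecting them outright matters here, because this heuristic also flags legitimate text that mixes scripts by convention (Japanese written with kanji and kana, for example).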
Philippa Cowderoy
in reply to Charlie Stross
Charlie Stross
in reply to Philippa Cowderoy
Thomas Sturm
in reply to Charlie Stross
Daniel Gibson
in reply to Charlie Stross
Doesn't the issue you're experiencing put them at risk of lawsuits?
Greg Egan
in reply to Charlie Stross
That’s terrible
It’s strange that Amazon allowed these accounts to sell the books at all; when I self-publish titles that either are, or have been, also published by traditional publishers, I have to jump through all kinds of hoops to prove to Amazon that I have the rights for the territory and format in question — sending them scans of my publishing contracts and letters of reversion. I’ve sometimes had to argue for *months* with multiple different Amazon employees following their opaque procedures to convince them that I’ve proved my case. So I don’t know what these grifters are doing to get their wholly fraudulent authorisation with such ease.
Charlie Stross
in reply to Greg Egan
Charlie Stross
Unknown parent
Sean
Unknown parent
noahm
Unknown parent
Andreas K
in reply to Charlie Stross
Well, yes.
And Unicode has a number of even more advanced fun topics, like optional decomposition of diacritics. Yes, there are combining code points that follow a base character to add all kinds of marks to it. So à and ä each have two Unicode representations, and thus also two UTF-8 encodings.
The macOS filesystem is one place that uses decomposed Unicode. So zip files containing filenames with diacritics look fine, but can actually break when used on, say, Linux.
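To make the two-representations point concrete, here is a small Python sketch using the standard `unicodedata` module. The helper `same_name` is a hypothetical name for illustration; it shows one way to compare filenames regardless of which form they arrived in.

```python
import unicodedata

composed = "\u00e0"      # "à" as a single precomposed code point
decomposed = "a\u0300"   # "a" followed by COMBINING GRAVE ACCENT

# Both render as "à", but they are different strings and different bytes.
print(composed == decomposed)        # False
print(composed.encode("utf-8"))      # b'\xc3\xa0'
print(decomposed.encode("utf-8"))    # b'a\xcc\x80'

def same_name(a: str, b: str) -> bool:
    """Compare two names after normalizing both to composed form (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(same_name(composed, decomposed))  # True
```

Normalizing names to a single form (NFC here, though NFD works equally well as long as it is applied consistently) before comparing or looking them up is the usual way around this kind of mismatch.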