PM SQL error on Unicode character

wheybags

25 Feb 2015, 23:32

Hey, I'm getting the following SQL error when trying to send a PM with a Unicode money bag in it :3

Image

scottc

26 Feb 2015, 00:09

That's probably the most specific error report I've seen in my life

wheybags

26 Feb 2015, 00:11

Seems too specific... Is there a debug mode in phpBB that's turned on for some reason?
Most things hide the specifics of their errors when in production.

scottc

26 Feb 2015, 00:24

FWIW: PMs with unicode snowmen work.

webwit
Wild Duck

26 Feb 2015, 00:49

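I just learned... MySQL's utf8 implementation actually isn't full UTF-8: it stores at most three bytes per character, so anything that needs four bytes only fits in utf8mb4.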
http://dev.mysql.com/doc/refman/5.5/en/ ... f8mb4.html
Some guy mentioned it here:
http://stackoverflow.com/questions/1168 ... lue-errors
("no Emoji, no astral plane, etc.")

This might be a good thing in some ways... imagine someone with emoji as a user name.

P.S. Oh I guess that means the answer to your PM would be "No" :twisted: Due to "technical reasons". :roll:
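
A minimal sketch of why the snowman PM works and the money bag PM does not, assuming the table uses MySQL's three-byte utf8 charset as the linked page describes. The helper name `fits_mysql_utf8mb3` is made up for illustration and is not part of phpBB or MySQL; it only checks code points on the Python side.

```python
def fits_mysql_utf8mb3(text: str) -> bool:
    """Return True if every code point needs at most 3 UTF-8 bytes.

    Code points up to U+FFFF (the Basic Multilingual Plane) encode to
    1-3 bytes; anything above that needs 4 bytes.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


snowman = "\u2603"        # SNOWMAN, inside the BMP
money_bag = "\U0001F4B0"  # MONEY BAG, outside the BMP

snowman_bytes = len(snowman.encode("utf-8"))
money_bag_bytes = len(money_bag.encode("utf-8"))

print(snowman_bytes, fits_mysql_utf8mb3(snowman))
print(money_bag_bytes, fits_mysql_utf8mb3(money_bag))
```

A check like this could reject or strip four-byte characters before the INSERT; converting the tables to utf8mb4 is the other option.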

scottc

26 Feb 2015, 00:55

Can I change mine to a unicode snowman then? :geek:

webwit
Wild Duck

26 Feb 2015, 00:56

You'll have to ask the secretary.

Is there a unicode three byte duck?

Daniel Beardsmore

26 Feb 2015, 00:57

The page also fails to note whether MySQL converts from decomposed to composed characters: when using VARCHAR, does the size limit cover the text's composed form? I would figure that using decomposed characters would cause a violation of the column size limit, even though, to the user, the text is within the limit … Obviously with CHAR, decomposed sequences won't fit.
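
To make the composed versus decomposed question concrete, here is a small sketch using Python's standard `unicodedata` module. It shows only how the two forms differ in length on the client side; it does not answer what MySQL does with them, which the page leaves open.

```python
import unicodedata

# "é" as one precomposed code point, and as "e" + COMBINING ACUTE ACCENT.
composed = unicodedata.normalize("NFC", "e\u0301")
decomposed = unicodedata.normalize("NFD", "\u00e9")

# Same text to the user, different lengths in code points and in bytes.
composed_len = len(composed)
decomposed_len = len(decomposed)
composed_bytes = len(composed.encode("utf-8"))
decomposed_bytes = len(decomposed.encode("utf-8"))

print(composed_len, composed_bytes)
print(decomposed_len, decomposed_bytes)
```

If the column limit counts code points, normalizing input to NFC before storing it is one way to keep the visible length and the stored length closer together.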

Muirium
µ

26 Feb 2015, 01:04

scottc wrote: Can I change mine to a unicode snowman then? :geek:
I will only allow emoji in usernames if the meaning of the emoji is explained — aka emojisplainin' — by the rest of the text. This is an accessibility feature, and punishment!

scottc

26 Feb 2015, 01:06

Unicode snowman is not an emoji, it's unicode just like your µ!

http://unicodesnowmanforyou.com/

Muirium
µ

26 Feb 2015, 01:07

My µ is authentic Greek, just the kind of thing Unicode was cobbled up for. What language of the world uses snowmen?

¿☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃?

Muirium
µ

26 Feb 2015, 01:21

webwit wrote: Is there a unicode three byte duck?
The Egyptians had several hieroglyphs for them:

Image Image

Or the genuine article:
Image

So on principle I'm all for it!

Daniel Beardsmore

26 Feb 2015, 01:24

Muirium wrote: My µ is authentic Greek …
Ah, but you wrote U+00B5 MICRO SIGN (µ) instead of U+03BC GREEK SMALL LETTER MU (μ)!
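
For anyone who wants to check which of the two they typed, a quick sketch with Python's `unicodedata` module. The variable names are just for illustration.

```python
import unicodedata

micro = "\u00b5"  # what Option + m produces, per the post above
mu = "\u03bc"     # the letter from the Greek block

micro_name = unicodedata.name(micro)
mu_name = unicodedata.name(mu)
print(f"U+{ord(micro):04X} {micro_name}")
print(f"U+{ord(mu):04X} {mu_name}")

# The two are distinct code points, so a plain comparison fails.
same_as_typed = micro == mu

# Compatibility normalization (NFKC) maps the micro sign onto the Greek
# letter; canonical normalization (NFC) leaves it alone.
same_after_nfkc = unicodedata.normalize("NFKC", micro) == mu
same_after_nfc = unicodedata.normalize("NFC", micro) == mu
print(same_as_typed, same_after_nfkc, same_after_nfc)
```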

webwit
Wild Duck

26 Feb 2015, 01:28

Muirium wrote: My µ is authentic Greek, just the kind of thing Unicode was cobbled up for. What language of the world uses snowmen?

¿☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃☃?
[Attachment: snowman.png (306.46 KiB)]

Muirium
µ

26 Feb 2015, 01:30

They're so happy! Smug bastards.
Daniel Beardsmore wrote:
Muirium wrote: My µ is authentic Greek …
Ah, but you wrote U+00B5 MICRO SIGN (µ) instead of U+03BC GREEK SMALL LETTER MU (μ)!
I always type it via Option + m. Hmm… the fact these are considered distinct characters concerns me!

webwit
Wild Duck

26 Feb 2015, 01:37

If you're a programmer and think time & timezones are a huge headache, then be very, very afraid of character encodings, which are somewhere halfway on the Scale of Evilness. The end of the scale is the Alps vortex.

Daniel Beardsmore

26 Feb 2015, 02:02

Muirium wrote: I always type it via Option + m. Hmm… the fact these are considered distinct characters concerns me!
If asked for the uppercase version of each, what would you expect?
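
One way to see what a Unicode-aware runtime does with that question, sketched with Python's built-in string methods. Other languages and libraries may behave differently.

```python
micro = "\u00b5"  # MICRO SIGN
mu = "\u03bc"     # GREEK SMALL LETTER MU

# Both uppercase to the same code point, GREEK CAPITAL LETTER MU.
micro_upper = micro.upper()
mu_upper = mu.upper()
print(f"U+{ord(micro_upper):04X}", f"U+{ord(mu_upper):04X}")

# The round trip is lossy for the micro sign: it comes back as the
# Greek letter, not as the character that went in.
round_trip = micro.upper().lower()
print(round_trip == micro, round_trip == mu)

# Case folding, meant for caseless comparison, also merges the two.
folded_equal = micro.casefold() == mu.casefold()
print(folded_equal)
```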

Mal-2

26 Feb 2015, 05:58

Muirium wrote: They're so happy! Smug bastards.
Daniel Beardsmore wrote:
Muirium wrote: My µ is authentic Greek …
Ah, but you wrote U+00B5 MICRO SIGN (µ) instead of U+03BC GREEK SMALL LETTER MU (μ)!
I always type it via Option + m. Hmm… the fact these are considered distinct characters concerns me!
The most common technical symbols got encoded long before the rest of the Greek alphabet, and included a second time (often rendered slightly differently) when Greek got added to Unicode.

For example: Ω versus Ω, ∆ versus Δ, ∏ versus Π, ∑ versus Σ.

Also potentially confusing is £ (pound) versus ₤ (lira).

In all cases, I have one of these available in my keyboard mapping, but not both (the actual Greek, and the pound). I am neither Greek nor a mathematician, but I do write them as characters.
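
A short sketch that prints the official names behind pairs like these. The code points below are my assumption about which characters the list refers to (ohm sign, increment, n-ary product, n-ary summation, pound, lira); pasted text may have been altered along the way, so treat them as illustrative.

```python
import unicodedata

# (technical symbol or first form, lookalike) -- assumed code points
PAIRS = [
    ("\u2126", "\u03a9"),  # ohm sign vs capital omega
    ("\u2206", "\u0394"),  # increment vs capital delta
    ("\u220f", "\u03a0"),  # n-ary product vs capital pi
    ("\u2211", "\u03a3"),  # n-ary summation vs capital sigma
    ("\u00a3", "\u20a4"),  # pound sign vs lira sign
]


def describe(ch):
    """Format a character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for first, second in PAIRS:
    print(f"{describe(first)}  vs  {describe(second)}")

# Normalization treats the pairs differently: the ohm sign folds into
# omega even under NFC, while the increment sign never becomes delta.
ohm_folds = unicodedata.normalize("NFC", "\u2126") == "\u03a9"
increment_folds = unicodedata.normalize("NFKC", "\u2206") == "\u0394"
print(ohm_folds, increment_folds)
```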

Muirium
µ

27 Feb 2015, 00:14

I'm going to opt to forget all this and relish my newfound ignorance!

Mal-2

27 Feb 2015, 04:01

Muirium wrote: I'm going to opt to forget all this and relish my newfound ignorance!
I suppose it was done because of the principle of keeping a full language set together when possible. The technical symbols are exactly that, technical symbols. Just as it would be wrong to substitute Latin C or P for the lookalike Cyrillic С or Р, you shouldn't substitute Latin A B H for Greek Α Β Η. They may pass for the same visually, but they don't parse the same.

Of course, we now see reverse substitution problems: people using the Greek alphabet code block where they should be using a technical symbol. That's somewhat more manageable, though, just because of the lower volume of writing involved.
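
To illustrate "they don't parse the same", a rough sketch that compares lookalikes and guesses which scripts a string mixes. The function `scripts_used` is a hypothetical helper, and taking the first word of each character's Unicode name is only a crude stand-in for real script data.

```python
import unicodedata


def scripts_used(text):
    """Crude script guess: the first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}


latin_cp = "CP"
cyrillic_cp = "\u0421\u0420"   # CYRILLIC CAPITAL LETTER ES, ER
latin_abh = "ABH"
greek_abh = "\u0391\u0392\u0397"  # GREEK CAPITAL LETTER ALPHA, BETA, ETA

# Identical on screen in many fonts, but never equal as strings.
print(latin_cp == cyrillic_cp, latin_abh == greek_abh)

# A word that mixes scripts is a hint that something was substituted.
mixed = "\u0421P"  # Cyrillic Es followed by Latin P
print(sorted(scripts_used(latin_cp)))
print(sorted(scripts_used(mixed)))
```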
