phpBB stack trace/MySQL error when posting non-BMP chars (e.g. emoji)

constexpr

10 Jul 2018, 16:15

Not sure if this is known; the couple of related threads I found were strictly about PMs, but this happens everywhere.
I don't think stack traces should be showing on a production server (though that may be a thing with phpBB, I wouldn't be surprised).

I tried submitting a post with an emoji in it, and this happened:
[Image: screenshot of the MySQL error and stack trace]
It's not restricted to just emoji, though: any non-BMP character (i.e. one outside the Basic Multilingual Plane, with a code point above U+FFFF) seems to cause the issue, e.g. some CJK or Indic characters, some arrow symbols, etc.
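A quick way to check whether a given character is affected is to look at its code point and its UTF-8 length. This is just a minimal Python sketch; the function names and sample characters are my own picks for illustration, not anything taken from phpBB or the board:

```python
def is_non_bmp(ch: str) -> bool:
    """Return True if the character's code point lies above U+FFFF."""
    return ord(ch) > 0xFFFF


def utf8_len(ch: str) -> int:
    """Return the number of bytes the character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# Illustrative samples: Latin letter, katakana, BMP arrow,
# emoji, CJK Extension B ideograph, supplemental arrow.
samples = ["A", "\u30AD", "\u2192", "\U0001F600", "\U00020000", "\U0001F800"]

for ch in samples:
    print(f"U+{ord(ch):04X} non_bmp={is_non_bmp(ch)} utf8_bytes={utf8_len(ch)}")
```

Every character for which `is_non_bmp` is true takes four bytes in UTF-8, which is the part that matters here.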

The cause is most likely that the underlying MySQL DB uses the `utf8` charset instead of `utf8mb4`. MySQL's `utf8` stores at most three bytes per character, while non-BMP characters need four bytes in UTF-8, so this is a typical issue with that kind of setup. Data should probably be migrated to `utf8mb4` at some point. The migration is generally not very difficult, but I'm not familiar with the specifics here at DT, so I can't say for sure. It should be done regardless (it's 2018 and emoji are really popular).
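To give a rough idea of what such a migration involves, here is a small Python sketch that only generates the SQL statements one would typically run; it doesn't touch any database. The database name, the table names and the `utf8mb4_unicode_ci` collation are assumptions on my part (I don't know what DT actually uses), and a real migration would also need a backup first, a matching connection charset, and possibly shorter indexed columns on older InnoDB setups:

```python
from typing import List


def conversion_statements(
    database: str,
    tables: List[str],
    charset: str = "utf8mb4",
    collation: str = "utf8mb4_unicode_ci",  # assumed collation, pick what fits
) -> List[str]:
    """Build ALTER statements that switch a database and its tables to utf8mb4."""
    # Change the default for tables created in the future.
    stmts = [
        f"ALTER DATABASE `{database}` CHARACTER SET = {charset} COLLATE = {collation};"
    ]
    # Convert existing tables, including their text columns.
    for table in tables:
        stmts.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return stmts


# Hypothetical names, for illustration only.
for stmt in conversion_statements("forum_db", ["phpbb_posts", "phpbb_privmsgs"]):
    print(stmt)
```

Generating the statements first makes it easy to review them (or try them on a copy of the data) before anything is run against the live board.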
Last edited by constexpr on 10 Jul 2018, 17:54, edited 1 time in total.

Blaise170
ALPS キーボード

10 Jul 2018, 16:23


constexpr

10 Jul 2018, 16:42

Yeah, that's the thread I mentioned in the first sentence. Apart from it only talking about PMs, no progress had been made there, so reviving it didn't really seem like a good idea.

Blaise170
ALPS キーボード

10 Jul 2018, 16:58

It's an issue with phpBB itself, not the Unicode implementation.

constexpr

10 Jul 2018, 17:51

I'm pretty sure that's not the case, as what failed was an SQL DML statement executed by mysqli (PHP's MySQL API), not a PHP function in and of itself (which would indicate that phpBB was at fault). The error message states that the SQL engine has encountered an invalid string value — exactly what you would expect to see as a result of an encoding issue.
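To illustrate the link between the posted text and an "invalid string value" error: MySQL usually quotes the bytes it could not store as hex escapes, and those bytes are simply the UTF-8 encoding of the first character that needs four bytes. A minimal Python sketch of that mapping follows; the helper names are made up, and this only imitates the general shape of such a message, not the exact one from the board:

```python
from typing import Optional


def hex_escape(data: bytes) -> str:
    """Format bytes as \\xNN escapes, similar to how MySQL quotes bad values."""
    return "".join(f"\\x{b:02X}" for b in data)


def offending_bytes(text: str) -> Optional[str]:
    """Return the escaped bytes of the first character that needs four
    bytes in UTF-8, or None if every character fits in three bytes."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 3:
            return hex_escape(encoded)
    return None


print(offending_bytes("nice board \U0001F600"))  # prints \xF0\x9F\x98\x80
print(offending_bytes("ALPS \u30AD\u30FC\u30DC\u30FC\u30C9"))  # prints None
```

If the bytes quoted in the error start with `\xF0`, that points at a four-byte UTF-8 sequence, i.e. a non-BMP character that a three-byte charset cannot hold.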

As I mentioned, this is a well-known issue with software running MySQL on the back end (especially software older than a couple of years), and this is exactly the way you'd expect it to manifest itself on a forum. phpBB may rely on the DB charset being utf8, but I doubt that's the case, as the two are separated by an API boundary (mysqli).

Here's further reading on the difference between utf8 and utf8mb4
Here's someone who had the same issue on their phpBB board and got emoji to work by changing the charset to utf8mb4
More on modifying an existing phpBB MySQL DB

But regardless of the charset issue, I feel that PHP stack traces, i.e. debug mode, should be disabled on the Deskthority production environment.
