utf 8 - How to use utf8 character arrays in c++?

Question

asked Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

Is it possible to make char* strings work with UTF-8 encoding in C++ (VC2010)?

For example if my source file is saved in utf8 and I write something like this:

const char* c = "a?áé??";

Is it possible to make this UTF-8 encoded? And if so, how can I use

char* c2 = new char[strlen("a?áé??")];

for dynamic allocation if characters can be variable length?

1 Answer

深蓝 · Answer 1 · 2021-10-23T19:27:33+0000

The encoding for narrow character string literals is implementation defined, so you'd really have to read the documentation (if you can find it). A quick experiment shows that both VC++ (VC8, anyway) and g++ (4.4.2, anyway) actually just copy the bytes from the source file; the string literal will be in whatever encoding your editor saved it in. (This is clearly in violation of the standard, but it seems to be common practice.)
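The quick experiment mentioned above can be reproduced by looking at the raw bytes of a literal. The following is only a minimal sketch: hex_bytes is a hypothetical helper name, not part of any library, and it is written in C++03 style so that it should also build on older compilers such as VC2010.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (name chosen for illustration): returns the bytes of a
// NUL-terminated string as space-separated, two-digit uppercase hex values.
std::string hex_bytes(const char* s) {
    assert(s != 0);
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (; *s != '\0'; ++s) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        unsigned char b = static_cast<unsigned char>(*s);
        if (!out.empty()) out += ' ';
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}
```

If the compiler really does copy the source bytes unchanged, passing a literal containing é from a file saved as UTF-8 would be expected to give C3 A9, while the same file saved as Latin-1 would give E9. That outcome depends on the compiler and its settings, so treat it as something to check rather than a guarantee.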

C++11 has UTF-8 string literals, which allow you to write u8"text" and be assured that "text" is encoded in UTF-8. But I don't really expect it to work reliably: to do this, the compiler has to know what encoding your source file uses. In all probability, compiler writers will continue to ignore the issue, just copying the bytes from the source file, and achieve conformance simply by documenting that the source file must be in UTF-8 for these features to work.
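On the dynamic allocation part of the question: strlen counts bytes up to the terminating NUL, not characters, so it already accounts for variable-length UTF-8 sequences. The buffer does need one extra byte for the terminator, though. The sketch below illustrates this under the assumption that the input is well-formed UTF-8; duplicate_bytes and utf8_code_points are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies a NUL-terminated byte string into a new[] buffer.
// The caller owns the result and must release it with delete[].
char* duplicate_bytes(const char* s) {
    assert(s != 0);
    std::size_t n = std::strlen(s);   // length in bytes, not in characters
    char* copy = new char[n + 1];     // +1 for the terminating NUL
    std::memcpy(copy, s, n + 1);      // copies the NUL as well
    return copy;
}

// Hypothetical helper: counts code points, assuming well-formed UTF-8, by
// skipping continuation bytes (those of the form 10xxxxxx).
std::size_t utf8_code_points(const char* s) {
    assert(s != 0);
    std::size_t count = 0;
    for (; *s != '\0'; ++s) {
        if ((static_cast<unsigned char>(*s) & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

For the byte string "a\xC3\xA1\xC3\xA9" (the UTF-8 encoding of "aáé", written with hex escapes so that the result does not depend on the source file's encoding), strlen reports 5 bytes while utf8_code_points reports 3. In real code a std::string would usually be a simpler choice than a raw new[] buffer.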
