Raw String Literals in C++
Raw string literals are string literals that can span multiple lines of code, that don't require escaping of embedded double quotes, and where escape sequences like \t and \n are not processed as escape sequences, but as normal text. For example, if you write the following with a normal string literal, you will get a compiler error because the string contains non-escaped double quotes:
string str = "Hello "World"!"; // Error!
With a normal string you have to escape the double quotes as follows:
string str = "Hello \"World\"!";
With a raw string literal you can avoid the need to escape the quotes. The raw string literal starts with R"( and ends with )".
string str = R"(Hello "World"!)";
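The equivalence of the two spellings can be checked directly. The following is a minimal sketch; the function names are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <string>

// Normal literal: each embedded double quote must be escaped.
std::string escaped_greeting() {
    return "Hello \"World\"!";
}

// Raw literal: the double quotes are written as-is between R"( and )".
std::string raw_greeting() {
    return R"(Hello "World"!)";
}

// Self-check: both spellings denote exactly the same characters.
void check_greetings() {
    assert(escaped_greeting() == raw_greeting());
}
```

Calling check_greetings() from main should pass silently, since the R"( and )" markers are not part of the string contents.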
Raw string literals can span across multiple lines. For example, if you write the following with a normal string literal, you will get a compiler error, because a normal string literal cannot span multiple lines:
string str = "Line 1
Line 2 with \t"; // Error!
Instead, you can use a raw string literal as follows:
string str = R"(Line 1
Line 2 with \t)";
This also demonstrates that in a raw string literal the \t escape sequence is not replaced with an actual tab character but is taken literally.
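To make the contents of such a multi-line raw string concrete, the sketch below compares it with a normal literal that spells out the same characters. The function names are hypothetical. Note that the second line of the raw literal starts in the first column, because any indentation would become part of the string.

```cpp
#include <cassert>
#include <string>

// Raw literal: the line break in the source becomes a newline character,
// and \t stays as a backslash followed by the letter t.
std::string raw_two_lines() {
    return R"(Line 1
Line 2 with \t)";
}

// Normal literal with the same contents: \n produces the newline and
// \\ produces the single backslash in front of the t.
std::string escaped_two_lines() {
    return "Line 1\nLine 2 with \\t";
}

// Self-check: same characters, and no tab character anywhere.
void check_two_lines() {
    assert(raw_two_lines() == escaped_two_lines());
    assert(raw_two_lines().find('\t') == std::string::npos);
}
```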
Some more examples:
- Escape sequences and double quotes:
std::string noNewlines(R"(\n\n)");
std::string cmd(R"(ls /home/docs | grep ".pdf")");
- Newlines:
std::string withNewlines(R"(Line 1 of the string...
Line 2...
Line 3)");
"Rawness" may be added to any string encoding:
LR"(Raw Wide string literal \t (without a tab))"
u8R"(Raw UTF-8 string literal \n (without a newline))"
uR"(Raw UTF-16 string literal \\ (with two backslashes))"
UR"(Raw UTF-32 string literal \u2620 (without a code point))"
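As a rough check of what these prefixed raw literals contain, the sketch below stores three of them in the matching standard string types and counts their characters. The function names are hypothetical. The u8R form is left out deliberately, because the element type of u8 literals changed from char to char8_t in C++20, so the matching string type depends on the language version in use.

```cpp
#include <cassert>
#include <string>

// Wide raw literal: four characters (a, backslash, t, b), no tab.
std::wstring wide_raw() { return LR"(a\tb)"; }

// UTF-16 raw literal: two backslashes, not one escaped backslash.
std::u16string utf16_raw() { return uR"(\\)"; }

// UTF-32 raw literal: the six characters \ u 2 6 2 0, not one code point.
std::u32string utf32_raw() { return UR"(\u2620)"; }

// Self-check of the character counts described above.
void check_prefixed_raw() {
    assert(wide_raw().size() == 4);
    assert(utf16_raw().size() == 2);
    assert(utf32_raw().size() == 6);
}
```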
"R" must come after "u8", "u", "U", etc. – it can't come in front of those specifiers.
Extended Raw String Literal Syntax
Since the raw string literal ends with )" you cannot embed a )" in your string using this syntax. For example, the following string is not valid because it contains )" in the middle of the string:
string str = R"(The characters )" are embedded in this string)"; // ERROR!
If you need embedded )" characters, you need to use the extended raw string literal syntax, which is as follows:
R"d-char-sequence(r-char-sequence)d-char-sequence"
The r-char-sequence is the actual raw string. The d-char-sequence is an optional delimiter sequence, which must be identical at the beginning and at the end of the raw string literal. This delimiter sequence can have at most 16 characters and cannot contain spaces, parentheses, or backslashes. Choose a delimiter such that a closing parenthesis followed by the delimiter and a double quote does not appear anywhere inside your raw string.
The previous example can be rewritten using a unique delimiter sequence as follows:
string str = R"-(The characters )" are embedded in this string)-";
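The effect of the delimiter can be checked against an ordinary escaped literal. The sketch below is only an illustration; the function names and the second delimiter (xyz) are hypothetical choices, and any delimiter that does not occur after a closing parenthesis inside the text would work equally well.

```cpp
#include <cassert>
#include <string>

// Delimiter "-": the literal ends only at )-" so the embedded )" is kept.
std::string delimited_raw() {
    return R"-(The characters )" are embedded in this string)-";
}

// Same text with a longer, arbitrarily chosen delimiter.
std::string delimited_raw_xyz() {
    return R"xyz(The characters )" are embedded in this string)xyz";
}

// Normal literal with the same contents, for comparison.
std::string escaped_equivalent() {
    return "The characters )\" are embedded in this string";
}

// Self-check: the delimiters are not part of the string contents.
void check_delimited() {
    assert(delimited_raw() == escaped_equivalent());
    assert(delimited_raw_xyz() == escaped_equivalent());
}
```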