String processing for embedded applications

1. Approach

For small embedded applications String processing seems to be unnecessary. The focus is on data processing and numerical values. But strings are still needed for parameterizing, test outputs and other purposes.

Richer embedded applications may deal with Strings, for example:

A human interface, simple commands, parameter handling. Commands and parameters are strings.
Messages, errors, warnings and process messages stored in plain text, possibly output via string-oriented interfaces such as serial communication (UART) or Ethernet protocols.

C++ knows a lot of features for String processing, see https://en.wikipedia.org/wiki/C%2B%2B_string_handling, and so does for example the Qt library, see https://doc.qt.io/qt-5/qstring.html. But these String operations often use dynamic memory, which is not recommended for long-running embedded control and has limits on simple processors.

Hence the old-style C functions for String processing (strlen, strcpy etc. and printf) are often used. They are contained in the C and C++ standards and seem to be the favored or proper possibility. By the way, this is often also a reason to stay with C instead of C++.

The standard C string functions have some disadvantages. Some replacements exist, often compiler- or platform-specific, for example strncpy_s from Microsoft's Visual Studio as a replacement for strncpy for supposedly more safety, which is not true in my mind. gcc also offers specific, sometimes proper solutions which are not standard.

Often some String operations need too much effort, especially on small processors. A printf with text processing is not simple in machine code. It should not be used if not necessary.

The emC library offers its own solutions for embedded control.

Because this development has a longer history (since about 1994), it carries some legacy, but also experience.

2. Definition of String capabilities in applstdef_emC.h

For the sources of src_emC some compiler switches are used which should be defined in applstdef_emC.h. These defines can also be used in the user's sources to adapt them to different platforms:

See applstdef_emC_h.html#DEF_StringCapabilities

3. Non 0-terminated StringJc

The zero-terminated string is one of the basic ideas of the C language. Passing a string argument to a function is very simple: only the pointer is necessary. The end is found by the character '\0' when the String is processed. Most other languages use a pointer to the character array together with its length.

C has a known problem with String overflow. One of the reasons is: if the length is not known (the '\0' must be searched first), then it cannot be used for calculations or checks.

For example, if you use the simple function strlen(text): if the text pointer refers to a faulty location (because of an error in software which is not yet ready), and this location is filled with for example AA AA … till the end of the memory, then this function searches for a long time. It may wrap at address 0, it may find a terminating 0 or not, and it keeps searching. This needs a long time, and the whole system may crash because of a time overflow in an interrupt.

Hence: the simple 0-terminated Strings are not proper, or are unfortunate, or even dangerous.

But it is sensible to use the 0-termination inside a given length. For example there is a buffer for the name of a parameter inside a larger data struct:

char nameParam[32]; float valueParam;

The name can have up to 32 characters. But it may be shortened internally by a \0.

4. Supplementation of unsafe basic string functions

See for example msdn…strcpy-is-unsafe-consider-using-strcpys-instead forum or msdn … strncpy_s. Microsoft offers in its Visual Studio compiler a strncpy_s(…) instead of strncpy(…), though strncpy(…) is safe (in comparison to strcpy, which is really unsafe). There are many hits on the internet about that problem. The argument for strncpy_s(…) is: strncpy(…) does not guarantee a 0-terminated result string if the src has exactly the same length as the result buffer or is longer. But that is not a problem of this function (https://linux.die.net/man/3/strncpy) but a problem of understanding its usage.

Hence there is a lot of wild growth. emC offers some own functions in its emC/Base/StringBase.h and …c:

4.1. strnlen_emC

The strlen(…) routine of standard C is really unsafe and should never be used. The known strnlen(…) is safe. See the description in https://man7.org/linux/man-pages/man3/strnlen.3.html. But this function comes from POSIX and the GNU C library, not from the classic C standard. Visual Studio warns on using strnlen(…). Other platforms may offer other implementations. Because the routine is really simple and can be adapted for special embedded platforms, it is offered as emC version too. If emC is in use, this function should be used.

The maximal expected length of a string should always be known. For example it is the length of the buffer containing the String. Then the 0-termination is not necessary if the string fills the whole buffer. To get compatibility with strlen(…) in legacy sources, the length value can be set to INT_MAX. Then the behavior is the same. It is better to use the known memory size or a known value for the maximal length of strings used in this application.
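The idea of a length-limited scan can be sketched in a few lines. This is not the emC source; the name boundedLen_demo is hypothetical, and returning 0 for a null pointer is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical sketch of a length-limited strlen.
 * Scans at most maxNrofChars characters and stops earlier at a '\0'.
 * Assumption: a null pointer is treated as an empty string. */
static int boundedLen_demo(char const* text, int maxNrofChars) {
  int n = 0;
  if (text == NULL) { return 0; }
  while (n < maxNrofChars && text[n] != '\0') {
    n++;
  }
  return n;   /* never larger than maxNrofChars */
}
```

With a buffer like char nameParam[32] the call boundedLen_demo(nameParam, sizeof(nameParam)) cannot read beyond the buffer, even if the name fills all 32 characters without a terminating 0.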

4.2. strncmp_emC

The strcmp(…) routine of standard C is unsafe because the comparison is not limited. The strncmp(…), see https://man7.org/linux/man-pages/man3/strncmp.3.html, is safe. But this routine can be improved: it can return the position of the first differing character. This is often necessary or (in debug situations) nice to know.

int strncmp_emC(char const* const text1, char const* const text2, int maxNrofChars);

It returns 0 if both texts are equal up to a found \0 or up to maxNrofChars. It returns <0 or >0 in the same way as the original strncmp(…), but the absolute value is the position of the first difference, counted from 1. If the first character text1[0] is already different, it returns 1 or -1. Comparing "abcde" with "abcdx" gives the absolute value 5 because the 5th character is different; following the sign rule of strncmp(…) the result is -5, because 'e' is less than 'x'. The original C routine strncmp(…) guarantees only the sign of the result (<0, 0 or >0), not its value.
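A comparison with this kind of return value can be sketched as follows. The function cmpPos_demo is hypothetical and only illustrates the described behavior; the real strncmp_emC may differ in details.

```c
/* Hypothetical sketch: compare like strncmp, but return the 1-based
 * position of the first difference as absolute value.
 * Negative if text1 is less than text2 at this position, else positive. */
static int cmpPos_demo(char const* text1, char const* text2, int maxNrofChars) {
  for (int ix = 0; ix < maxNrofChars; ++ix) {
    unsigned char c1 = (unsigned char)text1[ix];
    unsigned char c2 = (unsigned char)text2[ix];
    if (c1 != c2) {
      return c1 < c2 ? -(ix + 1) : (ix + 1);
    }
    if (c1 == '\0') { return 0; }   /* both texts end here, equal */
  }
  return 0;                         /* equal within maxNrofChars */
}
```

The characters are compared as unsigned char, as the standard strncmp does, so that the sign does not depend on whether char is signed on the platform.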

4.3. strpncpy_emC

The strcpy(…) routine of standard C is unsafe and should never be used, because with faulty arguments the destination is overwritten in an uncontrolled way, which can destroy the whole application. A simple example shows it:

char srcBuffer[6];  //declared as not initialized
//... and forgotten to write or the expected routine was not called.
char dstBuffer[20];   //may be large enough ...?
strcpy(dstBuffer, srcBuffer); //seems to be safe in the eyes of the developer

This corrupts the whole memory if srcBuffer does not contain a '\0' character and is not followed by any 0 in the memory. The cause is trivial and the result is catastrophic. Never use the simple strcpy(…).

Another example, similar:

char srcBuffer[6] = {0};  //better, it is initialized.
strcpy(srcBuffer, "ident"); //correct. Safe. 0-terminated
srcBuffer[5] = nr;    //user expects "ident5" or such, but the 0-termination is lost.
char dstBuffer[20];   //may be large enough ...?
strcpy(dstBuffer, srcBuffer); //seems to be safe in the eyes of the developer

This also corrupts memory, because the 0-termination in srcBuffer is missing, maybe due to a thinking error. Never use the simple strcpy(…).

The strncpy(…) is safe, but not in the eyes of all developers, because it is sometimes complicated:

char dstBuffer[20];   //may be large enough ...? but not initialized
strncpy(dstBuffer, srcBuffer, sizeof(srcBuffer));

Following the problems with srcBuffer above, this limits the number of characters copied from srcBuffer to the given length of srcBuffer, which prevents the error above. But there are two further problems:

1) The content in dstBuffer may not be 0-terminated.
2) Is the length argument of strncpy primarily meant to prevent overwriting too much memory in the destination, or to determine the number of characters taken from src? This is not clearly settled in practice.
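Effect 1) can be shown with standard C only. The following sketch uses the hypothetical helpers hasTerminator_demo and copyTerminated_demo; it illustrates one common way to use strncpy with the destination size and an explicit termination. It is not taken from emC and assumes zDst > 0.

```c
#include <string.h>

/* Returns 1 if a '\0' is found within the first zBuf bytes, else 0. */
static int hasTerminator_demo(char const* buf, int zBuf) {
  return memchr(buf, '\0', (size_t)zBuf) != NULL;
}

/* Copies with strncpy limited by the destination size (zDst > 0 assumed)
 * and terminates explicitly. Returns 1 if src did not fit, else 0. */
static int copyTerminated_demo(char* dst, int zDst, char const* src) {
  strncpy(dst, src, (size_t)zDst);            /* may leave dst unterminated */
  int truncated = !hasTerminator_demo(dst, zDst);
  dst[zDst - 1] = '\0';                       /* force the termination */
  return truncated;
}
```

Copying "abcde" into a buffer of 5 bytes with plain strncpy leaves no terminating 0 in the buffer; the helper above detects this case and shortens the result to "abcd".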

It seems that especially for effect 1) Microsoft's Visual Studio offers its own solution (tested on Visual Studio 2015, with an updated version on 2021-05-23):

char dst[40] = {0};  //initializes the whole content with 0
#ifdef __COMPILER_IS_MSVC__
dst[6] = 'Q';
int okMsc = strncpy_s(dst, "abcde", 5);
...

Following the known argument meanings of strncpy, this routine should copy the given five characters. But it additionally sets a terminating \0 on the 6th position and overwrites the 'Q': it writes 6 characters instead of 5 as the original strncpy(…) does. In Microsoft's thinking the 0-termination is more important than compatibility with other given standards. The thinking may be that the length argument is the length of the source. Then Microsoft's behavior is understandable. But the usual problem is that overwriting the destination should be prevented. Hence the length argument should follow the situation of the destination. More exactly, both should be regarded: an exactly given number of characters for source strings which are not 0-terminated, and the size of the destination buffer. The behavior of Microsoft's solution is wrong with respect to the destination length.

The offered emC solution: strpncpy_emC(…):

The strpncpy_emC version regards all necessities to copy and concatenate strings safely, with or without zero termination. See emC/Base/StringBase_emC.h:

int strpncpy_emC(char* dst, int posDst, int zDst, char const* src, int zSrc);

dst is the destination buffer, which is written starting from posDst. zDst is the length which can be filled with characters. It determines the used and written memory. src is the source string, either with the length given in zSrc or up to a found '\0' character. Especially if zSrc = -1 or <0, src is used up to its 0-termination. But dst never overflows because it is limited by zDst. The return value is the number of copied characters without the possibly copied terminating 0. It can simply be added to posDst for appending. Last but not least this operation does not spend unnecessary calculation time to fill the destination with 0 till its end. Usually the buffer should have been filled by an operation before; otherwise one terminating '\0' character is enough.
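The described argument handling can be sketched as follows. This is a hypothetical reimplementation under the name copyBounded_demo, written only from the description above; the real strpncpy_emC may treat corner cases (null pointers, negative positions) differently.

```c
#include <stddef.h>

/* Hypothetical sketch of a copy limited by both source and destination.
 * Writes to dst starting at posDst, never at or beyond dst[zDst].
 * zSrc < 0 means: take src up to its '\0'.
 * Returns the number of copied characters without the terminating 0. */
static int copyBounded_demo(char* dst, int posDst, int zDst,
                            char const* src, int zSrc) {
  int n = 0;
  int room = zDst - posDst;              /* free characters in dst */
  if (dst == NULL || src == NULL || posDst < 0 || room <= 0) { return 0; }
  while (n < room && (zSrc < 0 || n < zSrc) && src[n] != '\0') {
    dst[posDst + n] = src[n];
    n++;
  }
  if (n < room) { dst[posDst + n] = '\0'; }  /* terminate only if it fits */
  return n;
}
```

Because the return value counts only the copied characters, posDst += copyBounded_demo(...) leaves posDst on the written terminator, which the next call then overwrites.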

If the arguments do not fit together, especially if a given src string cannot be copied as a whole to the given dst because the number of characters which can be copied (zDst - posDst) is too small, then this can be seen as an error (the given string cannot be copied) or as expected behavior (fewer characters are copied because of the size of the destination). If it is seen as an error, then either an exception should be thrown or a simply testable output should be given. But it is defined as behavior. The resulting string can simply be checked by debugging and the result is obvious. Hence no exception is thrown. It is possible to compare the return value with a known length of the source to detect a truncation:

char dstBuffer[40] = {0};
int zDst = sizeof(dstBuffer);  //it is 40
int nrofCharsCopied = strpncpy_emC(dstBuffer, 0, zDst, src, -1);
if(src[nrofCharsCopied] !=0) {
  ....

Here it is expected and tested that src is always copied up to its 0-termination.

The operation is safe; an undesired memory location is never overwritten. See the example for concatenation:

char dstBuffer[40] = {0};
int zDst = sizeof(dstBuffer);  //it is 40
int posDst = strpncpy_emC(dstBuffer, 0, zDst, "first string", -1);
if(posDst < zDst) { dstBuffer[posDst++] = nr; } //write a number as char between
posDst += strpncpy_emC(dstBuffer, posDst, zDst, "  ", 2); //typical append
//append a numeric value
posDst += toString_int32_emC(dstBuffer + posDst, zDst - posDst, nr2, -1);
posDst += strpncpy_emC(dstBuffer, posDst, zDst, " kg\n", -1); //typical append

This is pure C. Using C++ with overloaded operators is simpler to read, but it is based on the same routines. As you see, the argument posDst makes concatenation easier. The called routine toString_int32_emC(…) does not support this handling of posDst; instead a pointer addition and a subtraction are needed on call, which is equivalent in execution but a little bit more complex in the source.

4.4. strnchr_emC, searchChar[Back]_emC

The same remarks as for strcmp etc. are valid for strchr. This function is not safe, because it may crash if the source string is by mistake not 0-terminated. The original routine in emC is:

int searchChar_emC ( char const* text, int zText, char cc);

The length of the text is limited by zText. A 0-termination is regarded if the 0 is found, but it is not necessary. The return value is the position of the found character, or -1 if not found. It follows known and proven Java concepts.
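A search with this index-style return value can be sketched as follows. The function findChar_demo is hypothetical and only follows the description above; it is not the emC implementation.

```c
/* Hypothetical sketch: search cc in text, limited by zText and by a '\0'.
 * Returns the index of the first occurrence, or -1 if not found. */
static int findChar_demo(char const* text, int zText, char cc) {
  for (int ix = 0; ix < zText && text[ix] != '\0'; ++ix) {
    if (text[ix] == cc) { return ix; }
  }
  return -1;
}
```

The index return value avoids pointer arithmetic on the caller side, for example when a "key=value" line is split at the position of the '='.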

char const* strnchr_emC ( char const* text, int cc, int maxNrofChars)

is also supported. The difference is only the return value, which is a pointer. It follows the pattern of the known strchr(…) of standard C, extended by a length limit.

searchCharBack_emC ( char const* const text, char cc, int fromIx);

is the corresponding solution for searching the last occurrence of cc in the given text.

4.5. searchString[Back]_emC

This is the replacement of the strstr(…) function, but with simpler usage following the Java approach in java.lang.String.indexOf(String).

int searchString_emC ( char const* text, int maxNrofChars, char const* search, int zs)

As with the other replacements a 0-termination is not needed. search is terminated either by a '\0' or by the given length zs. The text in which the search occurs is also limited either by a '\0' or by the given length maxNrofChars. The return value is either the found position or -1 if not found.
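A minimal sketch of such a limited substring search is given here. The name findString_demo is hypothetical, and two points are assumptions of this sketch, not taken from the emC sources: zs < 0 means "search up to its '\0'", and an empty search string is found at position 0 as in Java.

```c
/* Hypothetical sketch of an indexOf-like search on limited strings.
 * text is limited by maxNrofChars or '\0', search by zs or '\0'.
 * Returns the first found position or -1. */
static int findString_demo(char const* text, int maxNrofChars,
                           char const* search, int zs) {
  int zText = 0;
  int zSearch = 0;
  while (zText < maxNrofChars && text[zText] != '\0') { zText++; }
  while ((zs < 0 || zSearch < zs) && search[zSearch] != '\0') { zSearch++; }
  for (int pos = 0; pos + zSearch <= zText; ++pos) {
    int ix = 0;
    while (ix < zSearch && text[pos + ix] == search[ix]) { ix++; }
    if (ix == zSearch) { return pos; }   /* all characters matched */
  }
  return -1;
}
```

Both lengths are determined first, so that the comparison loop never reads beyond the given limits.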

int searchStringBack_emC ( char const* text, int maxNrofChars, char const* search, int zs)

is the corresponding function searching the last occurrence of the given String. It corresponds to the Java core function java.lang.String.lastIndexOf(String).

4.6. searchAnyChar[Back]_emC

This is the replacement of the strcspn(…) and strpbrk(…) functions, which search for the first character out of a given set, but with simpler usage.

int searchAnyChar_emC ( char const* text, int maxNrofChars, char const* chars, int zs)

int searchAnyCharBack_emC ( char const* text, int maxNrofChars, char const* chars, int zs)

is the corresponding function searching the last occurrence of any of the characters of chars in the given String.

4.7. Necessity of printf? instead toString_int32/float_emC

The printf is at the core of C usage, see all "Hello world" examples. But printf is a really complex operation. If it is used, it increases the memory footprint, which matters especially for weak embedded controllers with small capabilities.

What is really necessary? Sometimes numeric values should be presented as strings also on weak controllers, for example to output values via UART communication. The whole capability of the C standard printf is often too much.

The simple routine

int toString_int32_emC(char* buffer, int zBuffer, int32 value
  , int radix, int minNrofCharsAndNegative);

offers a conversion from a numeric value to a string in decimal or hexadecimal presentation. It avoids division because division may be an expensive operation in calculation time on some processors.

This solution fits into the approach of string concatenation. It avoids the necessity of including library functions for printf, which are often more extensive.
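How a decimal conversion can work without division is sketched below. This is a hypothetical routine int32ToDec_demo using a table of powers of ten and repeated subtraction; it is not the emC implementation and supports neither a radix nor a minimal width. It is an assumption of this sketch that no terminating 0 is written.

```c
#include <stdint.h>

/* Hypothetical sketch: int32 to decimal text without division.
 * Writes at most zBuffer characters, no terminating 0.
 * Returns the number of written characters. */
static int int32ToDec_demo(char* buffer, int zBuffer, int32_t value) {
  static const uint32_t pow10[10] = {
    1000000000u, 100000000u, 10000000u, 1000000u, 100000u,
    10000u, 1000u, 100u, 10u, 1u };
  int pos = 0;
  int started = 0;                       /* set after the first digit */
  uint32_t rest;
  if (value < 0) {
    if (pos < zBuffer) { buffer[pos++] = '-'; }
    rest = 0u - (uint32_t)value;         /* also correct for INT32_MIN */
  } else {
    rest = (uint32_t)value;
  }
  for (int i = 0; i < 10; ++i) {
    char digit = '0';
    while (rest >= pow10[i]) { rest -= pow10[i]; digit++; }
    if (digit != '0' || started || i == 9) {   /* skip leading zeros */
      if (pos >= zBuffer) { break; }           /* buffer full: truncate */
      buffer[pos++] = digit;
      started = 1;
    }
  }
  return pos;
}
```

Each digit needs at most nine subtractions, which is often cheaper than a division on processors without a hardware divider.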

4.8. Scanning of numeric values: parse[Int|Float]Radix_emC

This is the counterpart of the scanf approaches, simpler than the original and standard scanf(…).

int parseIntRadix_emC ( const char* src, int zSrc, int radix, int* parsedChars
                      , char const* addChars);

It returns the number. This function is a jack of all trades with respect to the necessities, though it is short in code (200 Bytes on a TI320 controller) and short in execution. The arguments, especially addChars, control the abilities:

parsedChars is the second output of the function: via the given integer pointer it returns the number of characters parsed for the number. It is usually necessary in the parsing process, but can be omitted if not necessary by giving a null pointer.
If radix = 10 it parses decimal, radix = 16 or = 8 hexadecimal or octal, or any other radix. A radix other than 10 or 16 may be rarely needed, but the radix is simply a number used for calculation and comparison.
The hexa digits are 'a'..'f' or 'A'..'F', continuing till 'z' for radix 36 (if desired).
If addChars = null or empty, it parses only the digits.
A space as first char in addChars = " " forces skipping over leading spaces and tabs; a "\n" on the first position forces skipping over all leading white spaces.
A minus on the first position, addChars = "-", or on the second position after a space or newline, addChars = " -" or "\n-", accepts a '-' for a negative number. A plus on that position, addChars = " +", accepts also a '+' character (positive value), which is generally unnecessary but hence possible.
A space or newline after the minus or plus, addChars = "- ", accepts spaces or tabs between the sign and the start of the number; the same holds for addChars = "+ ". The combination addChars = "\n+ " accepts white spaces before the number and spaces or tabs between sign and number.
An 'x' after these designations or at the start, addChars = "x", accepts switching to hexa numbers on a given 0x prefix in the number. It means addChars = "\n+ x" accepts white spaces before the sign, the sign, spaces and tabs after the sign, and then a 0x to detect a hexa number.
All other characters in addChars, for example " '| ,", are characters which can be placed between the digits. Especially the space to group digits in one number may be sensible for some applications. A space as separator should not be written on the first positions (see above), but on following positions:
- addChars = "  " accepts leading spaces and tabs, and then spaces (not tabs) between the digits; two spaces are given.
- addChars = "1 " does not accept leading spaces (the space is not on the first position) but accepts spaces as separator inside the number. The "1" has no meaning; it cannot be a separator because it is a digit. It is a little helper so that the space is not on the first position.
- addChars = " x " accepts for example ` 0x 34 56 ca FE` as hexa number.
- addChars = " x" accepts only ` 0x3456caFE`, not with space between the digits, but
- addChars = " x'" accepts also ` 0x3456'caFE` for better readability.

There may be some combinations which are not sensible but formally possible. The operation is not intended as a strict parser; it is intended to read numbers in a freer style, for example in a parameter list:

  param1 = + 456.45
  param2 = 0x100
  key = 234 356 357
  key2 = 0xcafe'affe

The numbers are parsed in any case after the '='; all these forms are accepted with addChars = " + x '". The white space form with "\n" is not necessary in most cases but possible if strict whitespace handling is desired. The floating point number is parsed with

float parseFloat_emC ( const char* src, int size, int* parsedChars);

which uses the parseIntRadix_emC(…) for the integer and fractional part and for the exponent.
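The digit loop at the core of such a parser can be sketched as follows. The function parseUIntRadix_demo is hypothetical and much simpler than parseIntRadix_emC: it handles no sign, no 0x prefix and no addChars, and it does not check for overflow. It only illustrates the radix handling and the parsedChars output.

```c
#include <stddef.h>

/* Hypothetical sketch: parse an unsigned number with the given radix
 * (2..36) from a string limited by zSrc. Parsing stops at the first
 * character which is not a digit of this radix.
 * parsedChars may be NULL. No overflow check is done. */
static int parseUIntRadix_demo(char const* src, int zSrc, int radix,
                               int* parsedChars) {
  int value = 0;
  int ix = 0;
  while (ix < zSrc) {
    char c = src[ix];
    int digit;
    if (c >= '0' && c <= '9') { digit = c - '0'; }
    else if (c >= 'a' && c <= 'z') { digit = c - 'a' + 10; }
    else if (c >= 'A' && c <= 'Z') { digit = c - 'A' + 10; }
    else { break; }                      /* also stops at '\0' */
    if (digit >= radix) { break; }       /* not a digit in this radix */
    value = value * radix + digit;
    ix++;
  }
  if (parsedChars != NULL) { *parsedChars = ix; }
  return value;
}
```

The number of parsed characters tells the caller where to continue, for example at the '.' of a fractional part or at a separator character.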

5. StringJc as unique string representation with capabilities

The struct StringJc in emC presents a String with its length, possibly in several forms depending on the compiler switches 'DEF_NO_StringUSAGE' and 'DEF_CharSeqJcCapabilities', see #applstdef_emC_h.html#DEF_StringCapabilities.

5.1. StringJc definition itself

It is a data struct consisting of two values: a pointer to the string itself and an integer value containing the length and some status bits.

One of the basic ideas in the development was: it should be possible to return it by value and also to pass it by value:

StringJc myStringOperation ( StringJc inp ) {
  StringJc ret = ... //build the String maybe from inp
  return ret;
}

This approach is used for the type StringJc. But depending on further capabilities the address is a union:

⇒ types_def_common.h: struct StringJc definition

It is a little bit complicated because of conditional compilation, but it is consistent:

If DEF_NO_StringUSAGE is defined, a StringJc can only be a simple const character sequence. It can be 0-terminated or not. Often it is 0-terminated, but the length is given in the StringJc instance.
If DEF_NO_StringUSAGE is not defined, the String can also be stored in a StringBuilderJc_s instance. This is marked in val by the constant value 0x3ffe, see also the next chapter for mLength_StringJc.
If DEF_CharSeqJcCapabilities is defined, a String can be gotten from an ObjectJc instance which implements the CharSeq interface. This is similar to Java. See chapter Vtbl_CharSeqJc: CharSequence in C language.
If C++ usage is active, the CharSeq can also be implemented by virtual operations of the named class.

It means, a StringJc can be:

a simple C-String, maybe 0-terminated or not, as const string, immutable as in Java.
a reference to a buffer to prepare a String, see chapter StringBuilderJc: Buffer to prepare Strings. Then it is a mutable String.
a reference to any Object which has operations according to Vtbl_CharSeqJc, see #Vtbl_CharSeqJc
a reference to any C++ instance of type CharSequenceJcpp which is an interface to the C++ CharSequence with virtual operations.

This offers some capabilities of String processing which are closer to the Java language than to the C/C++ standards. Especially the simple form, which only uses the str element of this union, is very simple and also proper for small, weak processors. It is the pointer to a string, not necessarily zero-terminated, and the length is given in val.
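The simple form, a pointer together with a length which is passed and returned by value, can be sketched as follows. The type Str_demo and the function trim_demo are hypothetical and strongly simplified; the real StringJc uses a union and status bits in val as described in this chapter.

```c
/* Hypothetical, simplified stand-in for the simple form of a string
 * reference: pointer and length, not necessarily 0-terminated. */
typedef struct Str_demo_T {
  char const* str;   /* start of the characters */
  int len;           /* number of characters */
} Str_demo;

/* Returns a reference to the same characters without leading and
 * trailing spaces. Nothing is copied, no memory is allocated. */
static Str_demo trim_demo(Str_demo inp) {
  Str_demo ret = inp;
  while (ret.len > 0 && ret.str[0] == ' ') { ret.str++; ret.len--; }
  while (ret.len > 0 && ret.str[ret.len - 1] == ' ') { ret.len--; }
  return ret;
}
```

Because the result only refers to a part of the input characters, a substring needs no terminating 0 and no copy; the two values typically fit into registers on return.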

5.2. Meaning of the val, mLength_StringJc

The val of this struct contains not only the length of the String but also some control bits. For simple usage 16 bits are sufficient; for more capability 32 bits are necessary for this val value.

The basic definition to evaluate this val is

//in applstdef_emC.h or in compl_adaption.h only if necessary:
#define mLength_StringJc  0x00003fff

This is the standard definition, also established in emC/Base/StringBase_emC.h:

#ifndef mLength_StringJc
#define mLength_StringJc 0x00003fff
#endif

Depending on this value some other definitions are contained in emC/Base/StringBase_emC.h:

#ifndef DEF_CharSeqJcCapabilities
  #define mVtbl_CharSeqJc 0
  #define kIsCharSeqJc_CharSeqJc 0
  #define kMaxNrofChars_StringJc (mLength_StringJc -2)
  #define mIsCharSeqJcVtbl_CharSeqJc 0
#else
  #define mVtbl_CharSeqJc (mLength_StringJc >>2)
  #define kIsCharSeqJc_CharSeqJc (mLength_StringJc -2)
  #define kMaxNrofChars_StringJc ((mLength_StringJc & ~mVtbl_CharSeqJc)-1)
  #define mIsCharSeqJcVtbl_CharSeqJc (mLength_StringJc & ~mVtbl_CharSeqJc)
#endif
#define kIs_0_terminated_StringJc (mLength_StringJc)
#define kIsStringBuilder_CharSeqJc (mLength_StringJc -1)
#define mNonPersists__StringJc       (mLength_StringJc +1)
#define mThreadContext__StringJc     ((mNonPersists__StringJc)<<1)

These definitions, based on mLength_StringJc == 0x3fff, define the following values:

Bits:

0x8000: mThreadContext__StringJc: This bit describes the location of a StringJc inside the ThreadContext. It is allocated thread local and should be deallocated after usage.
0x4000: mNonPersists__StringJc: If this bit is set, the string may be changed. It is not an immutable String.

Value mask with mLength_StringJc == 0x3fff:

0x3fff: kIs_0_terminated_StringJc: This value means that the length of the String is not known yet, but the string is zero-terminated. The length can be determined by searching the '\0' character. This definition makes it easy to define a const StringJc from a literal with an initializer list:
```
#define INIZ_z_StringJc(TEXT) { TEXT, kIs_0_terminated_StringJc}
```
0x3ffe: kIsStringBuilder_CharSeqJc: A reference to a StringBuilderJc is used, the string is in the buffer of the StringBuilderJc. The length is determined by the StringBuilderJc data.

If DEF_CharSeqJcCapabilities is not set, then it is simpler:

0x3ffd: kMaxNrofChars_StringJc: This is the maximal value for the length of a String. If the value masked with mLength_StringJc == 0x3fff is in the range 0 .. kMaxNrofChars_StringJc, then it is an immutable or mutable char const* string with this given length.

With these bit designations a StringJc reference can present all of the named strings. The simple case is always possible: the immutable simple char const* String.

If DEF_CharSeqJcCapabilities is given, then the StringJc can refer to an ObjectJc instance which implements the CharSeq function pointer table. The ObjectJc* pointer part of the union is used. The ObjectJc instance should offer 3 operations, to get the length, the character at any index, and a sub sequence, like the java.lang.CharSequence interface. But the StringJc can of course also be a simple immutable char const* or a StringBuilderJc instance.

0x3ffd: kIsCharSeqJc_CharSeqJc: If val masked with mLength_StringJc has this value, then the reference refers to an ObjectJc instance which should contain a function pointer table to CharSeqJc routines. The function pointer table is gotten from the instance by calling getVtbl_ObjectJc(obj, sign_Vtbl_CharSeqJc). This searches the function pointer table inside the ClassJc type information.
0x3000: mIsCharSeqJcVtbl_CharSeqJc: If these bits are set in val, then val & mLength_StringJc is in the range 0x3000..0x3ffc. The val & mVtbl_CharSeqJc is the index to a function pointer table which is used to implement dynamic calls at runtime in C language or in C++ without using virtual. The advantage of such a StringJc is: the index is already built (otherwise int getPosInVtbl_ObjectJc(othiz, sign_Vtbl_CharSeqJc); must be called, which needs a little effort). This variant should only be used for local values (held in the stack), which are safer than values anywhere in data memory. Otherwise there can be the same problems as with the virtual mechanism in C++: corrupted data can force a crash of program execution.
0x0fff: mVtbl_CharSeqJc: This is the mask for the posInVtbl for a CharSeqJc Object. See examples and chapter…TODO
0x2fff: kMaxNrofChars_StringJc: If (val & mLength_StringJc) <= kMaxNrofChars_StringJc then it is a char const* immutable String with this given length.
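The listed values can be checked by evaluating the macros from the definitions above for the variant with DEF_CharSeqJcCapabilities. The enum ValKind_demo and the function classifyVal_demo are hypothetical; they only illustrate how the length bits of val could be distinguished according to the list and are not the evaluation used inside emC.

```c
/* Macros copied from the definitions quoted above
 * (variant with DEF_CharSeqJcCapabilities). */
#define mLength_StringJc 0x00003fff
#define mVtbl_CharSeqJc (mLength_StringJc >>2)
#define kIsCharSeqJc_CharSeqJc (mLength_StringJc -2)
#define kMaxNrofChars_StringJc ((mLength_StringJc & ~mVtbl_CharSeqJc)-1)
#define mIsCharSeqJcVtbl_CharSeqJc (mLength_StringJc & ~mVtbl_CharSeqJc)
#define kIs_0_terminated_StringJc (mLength_StringJc)
#define kIsStringBuilder_CharSeqJc (mLength_StringJc -1)
#define mNonPersists__StringJc (mLength_StringJc +1)
#define mThreadContext__StringJc ((mNonPersists__StringJc)<<1)

/* Hypothetical classification of the length bits of val. */
typedef enum {
  PlainChars_demo, CharSeqIndexed_demo, CharSeqObject_demo,
  StringBuilder_demo, ZeroTerminated_demo
} ValKind_demo;

static ValKind_demo classifyVal_demo(unsigned val) {
  unsigned lenBits = val & mLength_StringJc;   /* ignore the flag bits */
  if (lenBits == kIs_0_terminated_StringJc)  { return ZeroTerminated_demo; }
  if (lenBits == kIsStringBuilder_CharSeqJc) { return StringBuilder_demo; }
  if (lenBits == kIsCharSeqJc_CharSeqJc)     { return CharSeqObject_demo; }
  if (lenBits <= kMaxNrofChars_StringJc)     { return PlainChars_demo; }
  return CharSeqIndexed_demo;                  /* 0x3000 .. 0x3ffc */
}
```

The special values are tested before the range check, so that only real lengths up to 0x2fff are treated as a plain character sequence.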

These definitions should not be seen as too complicated and sophisticated. Really it is simple:

Often only the simple form of StringJc is used (const char*).
If necessary the String can be more complex.
Some access operations (next chapter) know these bits and use them.

5.3. Build operations for StringJc in emC/Base/StringBase_emC.h

These operations can be used even if DEF_NO_StringUSAGE is set. Some operations are already contained in types_def_common.h, hence they are defined without including emC/Base/StringBase_emC.h. These definitions are also used in other basic features, especially exception handling:

⇒ types_def_common.h: INIZ.., null_StringJc, empty_StringJc

The application of the INIZ macros is simple. They are macros because the definition should already be done on the level of const definitions of the compiler. Note: We do not and will not use the nice automation given in C++ with initialization code which runs before main and/or usage of heap memory.

StringJc myConstString1 = INIZ_text_StringJc("content of this string1");

The line above defines two constant words in the const memory area: the reference to the constant string literal and the length of the string literal. It is a const memory area definition. C++ experts for PC programming may not know what is meant, but embedded C developers know it.

Do not write:

StringJc myConstString2 = INIZ_text_StringJc(refToAString);

Unfortunately the macro cannot distinguish between a String literal, which is a character array with a size known to the compiler, and a real pointer to any other String; the C and also the C++ compiler accept both. For a pointer the length is wrongly calculated with sizeof(refToAString), which here is the number of memory words of the pointer. The length of the referenced String cannot be calculated.
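The difference can be shown with sizeof alone. The macro LEN_OF_LITERAL_demo and the function lenViaPointer_demo are hypothetical; it is only assumed here that a length-from-literal macro works with sizeof, as the text above states for INIZ_text_StringJc.

```c
#include <stddef.h>

/* Hypothetical: length of a string literal at compile time.
 * sizeof of a literal is the array size including the '\0'. */
#define LEN_OF_LITERAL_demo(TEXT) ((int)(sizeof(TEXT) - 1))

/* Deliberately wrong usage: inside the function ref is a pointer,
 * so sizeof yields the pointer size, not the string length. */
static int lenViaPointer_demo(char const* ref) {
  return (int)(sizeof(ref) - 1);
}
```

LEN_OF_LITERAL_demo("content") is 7, but for a pointer the same expression gives the size of the pointer minus 1, independent of the referenced text. Some compilers warn about this pattern, but it is accepted.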

The macro

StringJc myString3 = INIZ_StringJc(refToAString, itsLength);

can be used especially if a StringJc instance should be created and initialized on the stack and the pointer to and the length of any given character array are known in variables.

5.4. Access functions in emC/Base/StringBase_emC.h

These access functions can also be used if DEF_NO_StringUSAGE is set. The possibilities are reduced accordingly, but the operations are offered.

5.4.1. getCharsAndLength_StringJc

This operation can be used even if DEF_NO_StringUSAGE is set.

=>source: src_emC/emC/Base/StringBase_emC.h[tag=getCharsAndLength_StringJc]
/* Gets the char-pointer and the number of chars stored in a StringJc.
 * This is a common way to get a the content of a StringJc-Instance if a char-pointer is necessary.
 * The char-pointer is either the immediately reference as stored in the StringJc,
 * or it is a pointer to the content of a referenced StringBuilderJc.  
 * The referenced character array may be persistent or not.
 * In generally, the user should not store and use the return value persistently outside of the StringJc-concept.
 * The routine is given to support evaluating the chars immediately in a C-like-way.
 * @param length destination variable (reference) to store the realy length. 
 *   The destination variable will be set anytime.
 *   If it is -1 then the StringJc is a CharSequence interface. 
 * @return the pointer to the chars. It may be null if the StringJc contains a null-reference.
 *   Or it is a CharSequence interface
 */
extern_C char const* getCharsAndLength_StringJc ( StringJc const* thiz, int* length);

offers a pointer to the char sequence and the length information for both cases, a directly referenced character sequence or a character sequence in a StringBuilderJc.

It is recommended to use this reference immediately, for example to copy the content or as an argument for a processing function which does not store the reference.

Storing the reference is error-prone because it is not assured that the character sequence is persistent. You should use the StringJc instance itself as reference.

6. StringBuilderJc: Buffer to prepare Strings

TODO

7. Vtbl_CharSeqJc: CharSequence in C language

TODO