String

class String(*args, **kwargs)

A GString is an object that handles the memory management of a C string.

The emphasis of GString is on text, typically UTF-8. Crucially, the “str” member of a GString is guaranteed to have a trailing nul character, and it is therefore always safe to call functions such as strchr() or strdup() on it.

However, a GString can also hold arbitrary binary data, because it has a “len” member, which includes any possible embedded nul characters in the data. Conceptually then, GString is like a GByteArray with the addition of many convenience methods for text, and a guaranteed nul terminator.
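The relationship between the data, the “len” member, and the guaranteed terminator can be modelled in plain Python. The following is a minimal sketch built on a bytearray; it does not call the GLib binding, and the names `payload`, `storage`, and `c_view` are illustrative only.

```python
# Pure-Python model, not the GLib API: binary data with an embedded nul,
# stored with the extra trailing nul that a String always keeps.
payload = b"ab\x00cd"                      # data containing an embedded nul
storage = bytearray(payload) + b"\x00"     # data plus the guaranteed terminator

length = len(payload)                      # what "len" would report: all data bytes
c_view = bytes(storage).split(b"\x00", 1)[0]   # what a C string function would see

print(length)    # counts the embedded nul, but not the terminator
print(c_view)    # stops at the first nul byte
```

In this model, length-aware code sees all five bytes, while code that treats the buffer as a C string sees only the part before the first nul.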

Constructors

class String
classmethod new(init: str | None = None) → String

Creates a new String, initialized with the given string.

Parameters:

init – the initial text to copy into the string, or None to start with an empty string

classmethod new_len(init: str, len: int) → String

Creates a new String with len bytes of the init buffer. Because a length is provided, init need not be nul-terminated, and can contain embedded nul bytes.

Since this function does not stop at nul bytes, it is the caller’s responsibility to ensure that init has at least len addressable bytes.

Parameters:
  • init – initial contents of the string

  • len – length of init to use

classmethod new_take(init: str | None = None) → String

Creates a new String, initialized with the given string.

After this call, init belongs to the String and may no longer be modified by the caller. The memory of init has to be dynamically allocated and will eventually be freed with free().

Added in version 2.78.

Parameters:

init – initial text used as the string. Ownership of the string is transferred to the String. Passing None creates an empty string.

classmethod sized_new(dfl_size: int) → String

Creates a new String, with enough space for dfl_size bytes. This is useful if you are going to add a lot of text to the string and don’t want it to be reallocated too often.

Parameters:

dfl_size – the default size of the space allocated to hold the string

Methods

class String
append(val: str) → String

Adds a string onto the end of a String, expanding it if necessary.

Parameters:

val – the string to append onto the end of string

append_c(c: int) → String

Adds a byte onto the end of a String, expanding it if necessary.

Parameters:

c – the byte to append onto the end of string

append_len(val: str, len: int) → String

Appends len bytes of val to string.

If len is positive, val may contain embedded nuls and need not be nul-terminated. It is the caller’s responsibility to ensure that val has at least len addressable bytes.

If len is negative, val must be nul-terminated and len is considered to request the entire string length. This makes append_len() equivalent to append().

Parameters:
  • val – bytes to append

  • len – number of bytes of val to use, or -1 for all of val
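The two length conventions described above can be sketched as a pure-Python model. This is an illustration on a bytearray, not the binding itself; the helper name `append_len_model` is hypothetical.

```python
def append_len_model(buf: bytearray, val: bytes, length: int) -> bytearray:
    """Hypothetical model of the append_len() length rules on a bytearray."""
    if length < 0:
        # Negative length: val is treated as nul-terminated, so only the
        # bytes before the first nul are used (the same result as append()).
        chunk = val.split(b"\x00", 1)[0]
    else:
        # Non-negative length: exactly `length` bytes are used, embedded
        # nuls included. The real function requires val to have that many
        # addressable bytes; slicing here merely avoids reading past the end.
        chunk = val[:length]
    buf.extend(chunk)
    return buf


print(append_len_model(bytearray(b"x"), b"a\x00b", 3))
print(append_len_model(bytearray(b"x"), b"ab\x00cd", -1))
```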

append_unichar(wc: str) → String

Converts a Unicode character into UTF-8, and appends it to the string.

Parameters:

wc – a Unicode character
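Because the character is stored as UTF-8, one appended character can add several bytes. A small sketch using Python's own encoder on a bytearray (a model of the effect, not a call into the binding):

```python
# Model only: appending one Unicode character as UTF-8 to a byte buffer.
buf = bytearray(b"price: ")
before = len(buf)

encoded = "€".encode("utf-8")   # U+20AC encodes to three bytes in UTF-8
buf.extend(encoded)

added = len(buf) - before
print(added)
print(buf.decode("utf-8"))
```

The byte length therefore grows by the encoded size of the character, not by one.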

append_uri_escaped(unescaped: str, reserved_chars_allowed: str, allow_utf8: bool) → String

Appends unescaped to string, escaping any characters that are reserved in URIs using URI-style escape sequences.

Added in version 2.16.

Parameters:
  • unescaped – a string

  • reserved_chars_allowed – a string of reserved characters allowed to be used, or None

  • allow_utf8 – set True if the escaped string may include UTF-8 characters
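To get a feel for the kind of output URI escaping produces, the sketch below uses the Python standard library's `urllib.parse.quote` as a stand-in. This is an assumption made for illustration: it is not the GLib implementation, the exact set of escaped characters may differ, and it only approximates the case where allow_utf8 is False. The helper name `uri_escape_model` is hypothetical.

```python
from typing import Optional
from urllib.parse import quote


def uri_escape_model(unescaped: str, reserved_chars_allowed: Optional[str] = None) -> str:
    """Stand-in for URI escaping, built on urllib.parse.quote (not GLib)."""
    # `safe` lists the characters that are left unescaped, which plays the
    # role of reserved_chars_allowed in this model.
    return quote(unescaped, safe=reserved_chars_allowed or "")


print(uri_escape_model("a b/c?d"))        # a%20b%2Fc%3Fd
print(uri_escape_model("a b/c?d", "/"))   # a%20b/c%3Fd
```

Allowing "/" keeps path separators readable while still escaping the space and the question mark.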

ascii_down() → String

Converts all uppercase ASCII letters to lowercase ASCII letters.

ascii_up() → String

Converts all lowercase ASCII letters to uppercase ASCII letters.
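ASCII-only case conversion leaves every non-ASCII byte untouched. Python's `bytes.lower()` and `bytes.upper()` follow the same ASCII-only rule, so they serve here as a model of the effect on UTF-8 data; this is an analogy, not a call into the binding.

```python
# Model only: ASCII-only case mapping applied to UTF-8 encoded text.
text = "ÀBC déf".encode("utf-8")

lowered = text.lower()   # only the bytes for A-Z change
uppered = text.upper()   # only the bytes for a-z change

print(lowered.decode("utf-8"))   # Àbc déf
print(uppered.decode("utf-8"))   # ÀBC DéF
```

The accented letters keep their case in both directions because their UTF-8 bytes fall outside the ASCII letter ranges.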

assign(rval: str) → String

Copies the bytes from a string into a String, destroying any previous contents. It is rather like the standard strcpy() function, except that you do not have to worry about having enough space to copy the string.

Parameters:

rval – the string to copy into string

down() → String

Converts a String to lowercase.

Deprecated since version 2.2: This function uses the locale-specific tolower() function, which is almost never the right thing. Use ascii_down() or utf8_strdown() instead.

equal(v2: String) → bool

Compares two strings for equality, returning True if they are equal. For use with HashTable.

Parameters:

v2 – another String

erase(pos: int, len: int) → String

Removes len bytes from a String, starting at position pos. The rest of the String is shifted down to fill the gap.

Parameters:
  • pos – the position of the content to remove

  • len – the number of bytes to remove, or -1 to remove all following bytes
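The removal and the shift of the remaining bytes can be modelled with slice deletion on a bytearray. This is a sketch of the described behaviour rather than the binding; the helper name `erase_model` is hypothetical.

```python
def erase_model(buf: bytearray, pos: int, length: int) -> bytearray:
    """Hypothetical model of erase() on a bytearray."""
    if length < 0:
        # -1 removes everything from pos to the end.
        del buf[pos:]
    else:
        # Remove `length` bytes; the tail moves down to close the gap.
        del buf[pos:pos + length]
    return buf


print(erase_model(bytearray(b"hello world"), 0, 6))    # bytearray(b'world')
print(erase_model(bytearray(b"hello world"), 5, -1))   # bytearray(b'hello')
```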

free(free_segment: bool) → str | None

Frees the memory allocated for the String. If free_segment is True it also frees the character data. If it’s False, the caller gains ownership of the buffer and must free it after use with free().

Instead of passing False to this function, consider using free_and_steal().

Parameters:

free_segment – if True, the actual character data is freed as well

free_to_bytes() → Bytes

Transfers ownership of the contents of string to a newly allocated Bytes. The String structure itself is deallocated, and it is therefore invalid to use string after invoking this function.

Note that while String ensures that its buffer always has a trailing nul character (not reflected in its “len”), the returned Bytes does not include this extra nul; i.e. it has length exactly equal to the “len” member.

Added in version 2.34.

hash() → int

Creates a hash code for the string; for use with HashTable.

insert(pos: int, val: str) → String

Inserts a copy of a string into a String, expanding it if necessary.

Parameters:
  • pos – the position to insert the copy of the string

  • val – the string to insert

insert_c(pos: int, c: int) → String

Inserts a byte into a String, expanding it if necessary.

Parameters:
  • pos – the position to insert the byte

  • c – the byte to insert

insert_len(pos: int, val: str, len: int) → String

Inserts len bytes of val into string at pos.

If len is positive, val may contain embedded nuls and need not be nul-terminated. It is the caller’s responsibility to ensure that val has at least len addressable bytes.

If len is negative, val must be nul-terminated and len is considered to request the entire string length.

If pos is -1, bytes are inserted at the end of the string.

Parameters:
  • pos – position in string where insertion should happen, or -1 for at the end

  • val – bytes to insert

  • len – number of bytes of val to insert, or -1 for all of val
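The combination of the pos and len conventions can be sketched as a pure-Python model on a bytearray. The helper name `insert_len_model` is hypothetical, and the code illustrates the documented rules rather than calling the binding.

```python
def insert_len_model(buf: bytearray, pos: int, val: bytes, length: int) -> bytearray:
    """Hypothetical model of the insert_len() position and length rules."""
    if length < 0:
        # Negative length: use the bytes of val up to its first nul.
        chunk = val.split(b"\x00", 1)[0]
    else:
        # Otherwise use exactly `length` bytes, embedded nuls included.
        chunk = val[:length]
    if pos < 0:
        # A position of -1 means "at the end".
        pos = len(buf)
    buf[pos:pos] = chunk   # empty-slice assignment inserts without overwriting
    return buf


print(insert_len_model(bytearray(b"held"), 2, b"llo wor", -1))
print(insert_len_model(bytearray(b"ab"), -1, b"cdef", 2))
```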

insert_unichar(pos: int, wc: str) → String

Converts a Unicode character into UTF-8, and inserts it into the string at the given position.

Parameters:
  • pos – the position at which to insert character, or -1 to append at the end of the string

  • wc – a Unicode character

overwrite(pos: int, val: str) → String

Overwrites part of a string, lengthening it if necessary.

Added in version 2.14.

Parameters:
  • pos – the position at which to start overwriting

  • val – the string that will overwrite the string starting at pos
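Overwriting replaces bytes in place and only grows the string when the new data runs past the current end. A minimal model on a bytearray (the helper name `overwrite_model` is hypothetical, and pos is assumed to lie within the existing data):

```python
def overwrite_model(buf: bytearray, pos: int, val: bytes) -> bytearray:
    """Hypothetical model of overwrite() on a bytearray."""
    # Slice assignment replaces the covered bytes; if the slice reaches past
    # the end of buf, the buffer is lengthened to fit val.
    buf[pos:pos + len(val)] = val
    return buf


print(overwrite_model(bytearray(b"hello"), 0, b"j"))     # bytearray(b'jello')
print(overwrite_model(bytearray(b"hello"), 3, b"p!!"))   # bytearray(b'help!!')
```

The first call keeps the length at five bytes; the second extends it to six.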

overwrite_len(pos: int, val: str, len: int) → String

Overwrites part of a string, lengthening it if necessary. This function will work with embedded nuls.

Added in version 2.14.

Parameters:
  • pos – the position at which to start overwriting

  • val – the string that will overwrite the string starting at pos

  • len – the number of bytes to write from val

prepend(val: str) → String

Adds a string onto the start of a String, expanding it if necessary.

Parameters:

val – the string to prepend on the start of string

prepend_c(c: int) → String

Adds a byte onto the start of a String, expanding it if necessary.

Parameters:

c – the byte to prepend on the start of the String

prepend_len(val: str, len: int) → String

Prepends len bytes of val to string.

If len is positive, val may contain embedded nuls and need not be nul-terminated. It is the caller’s responsibility to ensure that val has at least len addressable bytes.

If len is negative, val must be nul-terminated and len is considered to request the entire string length. This makes prepend_len() equivalent to prepend().

Parameters:
  • val – bytes to prepend

  • len – number of bytes in val to prepend, or -1 for all of val

prepend_unichar(wc: str) → String

Converts a Unicode character into UTF-8, and prepends it to the string.

Parameters:

wc – a Unicode character

replace(find: str, replace: str, limit: int) → int

Replaces the string find with the string replace in a String up to limit times. If the number of instances of find in the String is less than limit, all instances are replaced. If limit is 0, all instances of find are replaced.

If find is the empty string, since versions 2.69.1 and 2.68.4 the replacement will be inserted no more than once per possible position (beginning of string, end of string and between characters). This did not work correctly in earlier versions.

Added in version 2.68.

Parameters:
  • find – the string to find in string

  • replace – the string to insert in place of find

  • limit – the maximum instances of find to replace with replace, or 0 for no limit
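The limit rule for a non-empty find string can be modelled with Python's own `str.replace`. This sketch is not the binding: the helper name `replace_model` is hypothetical, and it returns the new text together with a count, on the assumption that the integer return value of replace() is the number of replacements made.

```python
from typing import Tuple


def replace_model(text: str, find: str, repl: str, limit: int) -> Tuple[str, int]:
    """Hypothetical model of the limit rule; expects a non-empty find."""
    available = text.count(find)            # non-overlapping occurrences
    # A limit of 0 means "no limit"; otherwise replace at most `limit`.
    n = available if limit == 0 else min(available, limit)
    return text.replace(find, repl, n), n


print(replace_model("a-b-c-d", "-", "+", 2))    # ('a+b+c-d', 2)
print(replace_model("a-b-c-d", "-", "+", 0))    # ('a+b+c+d', 3)
print(replace_model("a-b-c-d", "-", "+", 10))   # ('a+b+c+d', 3)
```

A limit larger than the number of matches behaves like no limit at all, as the last call shows.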

set_size(len: int) → String

Sets the length of a String. If the length is less than the current length, the string will be truncated. If the length is greater than the current length, the contents of the newly added area are undefined. (However, as always, the byte at index len of str will be a nul byte.)

Parameters:

len – the new length
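The two cases can be sketched on a bytearray. This is a model only: the helper name `set_size_model` is hypothetical, and it fills the newly added area with zero bytes purely so that the example is deterministic, whereas the real contents are undefined and must not be relied on.

```python
def set_size_model(buf: bytearray, length: int) -> bytearray:
    """Hypothetical model of set_size() on a bytearray."""
    if length <= len(buf):
        # Shrinking: drop everything after the first `length` bytes.
        del buf[length:]
    else:
        # Growing: the real contents are undefined; zero fill is an
        # assumption of this model, not documented behaviour.
        buf.extend(b"\x00" * (length - len(buf)))
    return buf


print(set_size_model(bytearray(b"hello"), 2))        # bytearray(b'he')
print(len(set_size_model(bytearray(b"hello"), 8)))   # 8
```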

truncate(len: int) → String

Cuts off the end of the String, leaving the first len bytes.

Parameters:

len – the new size of string

up() → String

Converts a String to uppercase.

Deprecated since version 2.2: This function uses the locale-specific toupper() function, which is almost never the right thing. Use ascii_up() or utf8_strup() instead.

Fields

class String
allocated_len

The number of bytes that can be stored in the string before it needs to be reallocated. May be larger than len.

len

Contains the length of the string, not including the terminating nul byte.

str

Points to the character data. It may move as text is added. The str field is nul-terminated and so can be used as an ordinary C string.