Data

String

Python

 

Defining String

Python 2.7
str1 = "Hello World!"

Strings can simply be defined by setting a string value to a variable.

 

String Literals

Python 2.7
name1 = 'Sam'       # Single quoted
name2 = "Sam"       # Double quoted
name3 = '''Sam'''   # Triple quoted
name4 = """Sam"""   # Triple quoted

String literals come in 3 quoting styles, which gives 4 forms because triple quotes accept either quote character:

  • Single quoted (')
  • Double quoted (")
  • Triple quoted (''' or """)
 

Triple quoted String Literal

Python 2.7
str1 = """This is a
Triple quoted string
expanded through 4
Lines."""

Triple quoted String Literals are useful when the text spans several lines or contains both single and double quotes, because line breaks and quote characters can be typed directly without escaping.

Triple quotes can be specified by 3 single or double-quotes.

Triple quoted Strings still parse escaped characters like \n.
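As a small illustration of the points above, the sketch below compares a triple quoted string with the equivalent single-line literal, and shows that escapes are still processed unless the literal is also raw. The variable names are made up for this example, and the code is expected to behave the same on Python 2.7 and Python 3.

```python
# A triple quoted string keeps the line break typed in the source.
poem = """Line one
Line two"""

# The same value written on one line with an explicit \n escape.
same_poem = "Line one\nLine two"

# Escapes are still processed inside triple quotes: \t becomes one tab character.
tabbed = """a\tb"""

# Adding the r prefix makes the triple quoted string raw,
# so the backslash and the t stay as two separate characters.
raw_tabbed = r"""a\tb"""

print(poem == same_poem)              # True
print(len(tabbed), len(raw_tabbed))   # lengths 3 and 4
```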

 

Character Escape

Python 2.7
text1 = 'Sam\'s age:\t21'   # Sam's age:   21

text2 = r'Sam\'s age:\t21'  # Sam\'s age:\t21

Characters that are hard to type, or that would confuse the parser (such as a quote of the same kind that delimits the literal), must be escaped with a backslash. These are the supported escape sequences:

  • \' Single quote
  • \" Double quote
  • \\ Backslash
  • \n Newline
  • \r Carriage return
  • \v Vertical tab
  • \t Tab
  • \b Backspace
  • \f Form feed
  • \a Bell
  • \xXX Latin-1 character ('\x4E' -> 'N')
  • \OOO Octal Latin-1 character ('\116' -> 'N')
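  • \uXXXX 16-bit Unicode character, recognized only inside Unicode literals in Python 2 (u'\u004E' -> u'N')
  • \UXXXXXXXX 32-bit Unicode character, recognized only inside Unicode literals in Python 2 (u'\U0000004E' -> u'N')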

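Prefixing a String Literal with r or R marks it as a Raw String: backslashes are kept as literal characters instead of starting an escape sequence. A backslash before a quote still stops the quote from ending the literal, but the backslash itself stays in the string, as text2 above shows.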
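The following sketch checks a few of the escapes listed above and compares a normal literal with its raw equivalent. The Windows-style path is only an illustrative value, not something from the original text.

```python
# Hex and octal escapes both produce the character 'N' (code 78).
hex_n = '\x4E'
octal_n = '\116'

# In a normal literal \t is a single tab character.
tab = '\t'

# In a raw literal the backslash and the t are kept as two characters.
raw_tab = r'\t'

# A path with backslashes: doubled in a normal literal, typed as-is in a raw one.
normal_path = 'C:\\temp\\new'
raw_path = r'C:\temp\new'

print(hex_n, octal_n)
print(len(tab), len(raw_tab))
print(normal_path == raw_path)
```

Raw strings are a common choice for regular expressions and Windows paths, where backslashes appear often.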

 

Unicode Literals

Python 2.7
# -*- coding: utf-8 -*-

utfstr1 = u"こんにちは"
utfstr2 = u"\u3053\u3093\u306B\u3061\u306F"

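Python 2 and Python 3 treat string literals differently: in Python 2 a plain literal is a byte string (str) and text needs the separate unicode type, while in Python 3 every plain literal is already a Unicode string.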

These are some ways to make coding with Unicode characters more consistent in Python:

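  • Precede string literals with u or U to make them Unicode strings. Both prefixes mean the same thing; neither selects an encoding such as UTF-16 or UTF-32. The prefix is also accepted by Python 3.3 and later, where it has no effect.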
  • Use the magic comment below, on the first or second line of your source code.
# -*- coding: utf-8 -*-
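
To show how Unicode text relates to bytes, here is a minimal sketch that encodes the greeting from the example above to UTF-8 and decodes it back. It uses only escaped code points, so the source stays pure ASCII and the magic comment is not needed for this snippet. The variable names are assumptions for illustration; the code should run on Python 2.7 and on Python 3.3 or later.

```python
# The greeting written with escaped code points (5 characters).
greeting = u"\u3053\u3093\u306B\u3061\u306F"

# Encoding converts text to bytes; each of these characters takes 3 bytes in UTF-8.
encoded = greeting.encode("utf-8")

# Decoding converts the bytes back to text.
decoded = encoded.decode("utf-8")

print(len(greeting))        # number of characters
print(len(encoded))         # number of bytes
print(decoded == greeting)  # the round trip preserves the text
```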
 

String Length

Python 2.7
str1 = "Hello World!"

length = len(str1)   # 12

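The len() function returns the number of characters inside a string. For a Python 2 byte string (str) this is the number of bytes, so non-ASCII text should be held in a Unicode string to be counted per character.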

 

Get Character

Python 2.7
str1 = "Hello World!"

firstCh = str1[0]  # H

Strings in Python can be seen as zero-based character arrays.

You can get a character in a string at any index by specifying the index of the character inside brackets ([]).
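Beyond the first character, indexing also accepts negative numbers, which count from the end of the string. The sketch below shows this and what happens when the index is out of range; the variable names are chosen for this example only.

```python
str1 = "Hello World!"

first_ch = str1[0]                  # first character
last_ch = str1[-1]                  # negative indexes count from the end
last_by_len = str1[len(str1) - 1]   # same character, using the length

# An index outside the string raises IndexError.
try:
    str1[100]
    out_of_range = False
except IndexError:
    out_of_range = True

print(first_ch, last_ch, out_of_range)
```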

 

Set Character

Python 2.7
str1 = "Hello World!"

str2 = str1[:1] + '3' + str1[2:]  # H3llo World!

Strings are immutable in Python, so you can read the character at an index but you cannot assign to it.

To replace a character in a string, you need to create a new string.
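The slicing approach above can be wrapped in a small function. The helper name replace_char_at is hypothetical (it is not part of Python); this is one possible sketch, and it also demonstrates that direct assignment fails with a TypeError.

```python
def replace_char_at(text, index, new_char):
    """Return a copy of text with the character at index replaced.

    Hypothetical helper for illustration; not a built-in.
    """
    if not 0 <= index < len(text):
        raise IndexError("index out of range")
    return text[:index] + new_char + text[index + 1:]


str1 = "Hello World!"
str2 = replace_char_at(str1, 1, '3')

# Assigning to an index is not allowed because strings are immutable.
try:
    str1[1] = '3'
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(str2)
print(str1)   # the original string is unchanged
```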

 

Concatenating Strings

Python 2.7
str1 = "Hello"
str2 = "World!"

str3 = str1 + " " + str2  # Hello World!

You can concatenate strings using the + operator.
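When there are many pieces, the join() method is a common alternative to repeated +. The sketch below shows both, and also that + only works between strings, so other types have to be converted first. The variable names are assumptions for this example.

```python
words = ["Hello", "World!"]

# join() inserts the separator between the items of a sequence of strings.
joined = " ".join(words)

# += builds a new string and rebinds the name; the old string is not modified.
greeting = "Hello"
greeting += " World!"

# Mixing a string and a number with + raises TypeError; convert with str() first.
count = 3
try:
    label = "Count: " + count
except TypeError:
    label = "Count: " + str(count)

print(joined)
print(greeting)
print(label)
```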

 

Find Substring

Python 2.7
str1 = "Hello World"

index = str1.find('llo')   # 2

The find() method searches for the index of the first occurrence of some string inside another one.

If the substring does not exist, it returns -1.
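A few related calls are sketched below: find() with a start position, rfind() for the last occurrence, the in operator for a simple membership test, and index(), which raises ValueError instead of returning -1. The variable names are chosen for illustration.

```python
str1 = "Hello World"

first_o = str1.find('o')                 # first occurrence
second_o = str1.find('o', first_o + 1)   # search again after the first match
last_o = str1.rfind('o')                 # last occurrence
missing = str1.find('xyz')               # not found

# The in operator answers "is it there?" without an index.
has_world = 'World' in str1

# index() behaves like find() but raises ValueError when nothing is found.
try:
    str1.index('xyz')
    index_raised = False
except ValueError:
    index_raised = True

print(first_o, second_o, last_o, missing, has_world, index_raised)
```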

 

Get Substring

Python 2.7
str1 = "Hello World!"

substr1 = str1[6:]     # World!
substr2 = str1[6:8]    # Wo

Strings can be seen as immutable sequences of characters.

Substrings can be extracted in the same way sublists are extracted from lists.

This is called Slicing in Python.

The slice syntax is shown below. All three parts are optional:

[inclusive start index:exclusive end index:step]
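
The slice syntax can be explored with a few variations. The sketch below omits parts of the slice, uses negative indexes, and uses a step; the variable names are assumptions for this example.

```python
str1 = "Hello World!"

head = str1[:5]            # start omitted: from the beginning
tail = str1[-6:]           # negative start: the last 6 characters
every_second = str1[::2]   # step of 2: characters at indexes 0, 2, 4, ...
reversed_str = str1[::-1]  # step of -1: the string backwards

# Unlike indexing, a slice that runs past the end is clamped instead of raising.
clamped = str1[6:100]

print(head)
print(tail)
print(every_second)
print(reversed_str)
print(clamped)
```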
 

Convert to String

Python 2.7
num = 12

str1 = str(num)  # "12" (a string, not a number)

The str() function converts most types to a string.
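
As a quick check of how str() treats a few built-in types, the sketch below converts several values and converts a numeric string back with int(). The variable names are made up for this example.

```python
from_int = str(12)
from_float = str(3.5)
from_bool = str(True)
from_none = str(None)
from_list = str([1, 2])

# int() goes the other way and parses a numeric string.
round_trip = int(from_int)

print(from_int, from_float, from_bool, from_none, from_list, round_trip)
```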