As I showed several times before, a string is a collection of characters in a specific order. You can access the individual characters of a string using indices.

String indices

Each symbol in a string has a position, which can be referred to by its index number. The index numbers start at 0 and increase to the length of the string minus 1. The following table shows the word “python” in the first row and the indices for each letter in the second and third rows:

 p   y   t   h   o   n
 0   1   2   3   4   5
-6  -5  -4  -3  -2  -1

As you can see, you can use positive indices, which start at the first letter of the string and increase until the end of the string is reached, or negative indices, which start with -1 for the last letter of the string and decrease until the first letter of the string is reached.

As the length of a string s is len(s), the last letter of the string has index len(s)-1. With negative indices, the first letter of the string has index -len(s).

If a string is stored in a variable, the individual letters of the string can be accessed by the variable name and the index of the requested letter between square brackets ([]) next to it.

fruit = "orange"
print( fruit[1] )     # r
print( fruit[2] )     # a
print( fruit[3] )     # n
print( fruit[-2] )    # g
print( fruit[-6] )    # o
print( fruit[0] )     # o
print( fruit[-3] )    # n
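
The relation between positive and negative indices described above can be checked directly. The following is a minimal sketch; the variable names s and n are only illustrative.

```python
s = "python"
n = len( s )

# The last letter: positive index len(s)-1, negative index -1.
print( s[n-1], s[-1] )
# The first letter: positive index 0, negative index -len(s).
print( s[0], s[-n] )

# For every valid positive index i, the negative index i-len(s)
# refers to the same letter.
for i in range( n ):
    print( i, i-n, s[i], s[i-n] )
```

Each line printed by the loop shows a positive index, the matching negative index, and the same letter twice.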

You can also use variables as indices, and even calculations or function calls. You must make sure, however, that calculations result in integers, because you cannot use floats as indices. Below are some examples, most of which are so convoluted that I do not see any reason to incorporate them like this in a program. But they show what is possible.

from math import sqrt

fruit = "orange"
x = 3

print( fruit[3-2] )                          # index 1: r
print( fruit[int( sqrt( 4 ) )] )             # index 2: a
print( fruit[2**2] )                         # index 4: g
print( fruit[int( (x-len( fruit ))/3 )] )    # index -1: e
print( fruit[-len( fruit )] )                # index -6: o
print( fruit[-x] )                           # index -3: n
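
To illustrate the remark about floats: in Python 3 the division operator / always produces a float, even when the result is a whole number, so its result cannot be used as an index directly. The sketch below is only an illustration; the names half and float_failed are made up for this example, and try ... except is used solely so that the program can continue after the error.

```python
fruit = "orange"
half = 4 / 2            # this is the float 2.0, not the integer 2
float_failed = False

# Using a float as an index raises a TypeError.
try:
    print( fruit[half] )
except TypeError as error:
    float_failed = True
    print( "TypeError:", error )

# Integer division (//) or an explicit int() conversion gives a usable index.
print( fruit[4 // 2] )
print( fruit[int( half )] )
```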

In principle, you can also use an index with the actual string rather than a variable that contains it, e.g., "orange"[2] is the letter "a". For obvious reasons no one ever does that, though.

Besides using single indices you can also access a substring (also called a “slice”) from a string by using two numbers between the square brackets with a colon (:) in between. The first of these numbers is the index where the substring starts, the second where it ends. The substring does not include the letter at the second index. By leaving out the left number you indicate that the substring starts at the beginning of the string (i.e., at index 0). By leaving out the right number you indicate that the substring ranges up to and includes the last character of the string.

If you try to access a character using an index that is beyond the reaches of a string, you get a runtime error (an IndexError with the message “string index out of range”). For a range of indices to access substrings such limitations do not exist; you can use numbers that are outside the bounds of the string.

fruit = "orange"
print( fruit[:] )                # orange
print( fruit[0:] )               # orange
print( fruit[:6] )               # orange
print( fruit[:100] )             # orange
print( fruit[:len( fruit )] )    # orange
print( fruit[1:-1] )             # rang
print( fruit[2], fruit[1:6] )    # a range
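
The difference between a single index and a slice when the numbers fall outside the string can be made visible with a small sketch. The name index_failed is made up for this example, and try ... except is used only so that the program keeps running after the error.

```python
fruit = "orange"
index_failed = False

# A single index beyond the end of the string raises an IndexError.
try:
    print( fruit[6] )
except IndexError as error:
    index_failed = True
    print( "IndexError:", error )

# Slices are tolerant: bounds outside the string are clipped to it.
print( fruit[2:100] )       # same as fruit[2:]
print( fruit[-100:3] )      # same as fruit[:3]
print( len( fruit[6:] ) )   # the slice is an empty string, not an error
```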

Traversing strings

I already explained how you can traverse the characters of a string using a for loop:

fruit = 'apple'
for char in fruit:
    print( char, '- ', end='' )

Now that you know about indices, you probably realize you can also use those to traverse the characters of a string:

fruit = 'apple'

for i in range( 0, len( fruit ) ):
    print( fruit[i], "- ", end="" )
print()

i = 0
while i < len( fruit ):
    print( fruit[i], "- ", end="" )
    i += 1

If you just want to traverse the individual characters of a string, the first method, using for <character> in <string>:, is by far the most elegant and readable. However, occasionally you have to solve problems in which you might prefer one of the other methods.
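
One situation in which an index-based loop is the more natural choice is when each step needs more than one position of the string. As a hedged illustration (the word and the task are chosen only for this example), the sketch below reports every letter that is immediately followed by the same letter.

```python
word = "bookkeeper"
count = 0

# Stop one position early, because each step also looks at index i+1.
for i in range( len( word ) - 1 ):
    if word[i] == word[i+1]:
        count += 1
        print( word[i], "is doubled at indices", i, "and", i+1 )

print( "Number of doubled letters:", count )
```

With for <character> in <string>: the loop only sees the current character, so comparing it with its neighbor would require extra bookkeeping.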

Write code that for a string prints the indices of all of its vowels (a, e, i, o, and u). This can be done with a for loop or a while loop, though the while loop is more suitable.

Write code that uses two strings. For each character in the first string that has exactly the same character at the same index in the second string, you print the character and the index. Watch out that you do not get an “index out of range” runtime error. Test it with the strings "The Holy Grail" and "Life of Brian".

Write a function that takes a string as argument, and creates a new string that is a copy of the argument, except that every non-letter is replaced by a space (e.g., "ph@t l00t" is changed to "ph t l  t", with one space for each replaced character). To write such a function, you will start with an empty string, and traverse the characters of the argument one by one. When you encounter a character that is acceptable, you add it to the new string. When it is not acceptable, you add a space to the new string. Note that you can check whether a character is acceptable by simple comparisons. For example, any lower case letter can be found using the test if ch >= 'a' and ch <= 'z':.

Extended slices

Slices in Python can take a third argument, which is the step size (or “stride”) that is taken between indices. It is similar to the third argument for the range() function. The format for slices then becomes <string>[<begin>:<end>:<step>]. By default the step size is 1.

The most common use for the step size is to use a negative step size in order to create a reversed version of a string.

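fruit = "banana"
print( fruit[::2] )     # bnn: every second letter, starting at index 0
print( fruit[1::2] )    # aaa: every second letter, starting at index 1
print( fruit[::-1] )    # ananab: the string reversed
print( fruit[::-2] )    # aaa: every second letter, from the end backward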
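
The similarity between the step size of a slice and the third argument of range() can be shown by building the same substring in both ways. This is a minimal sketch; the variable name selected is only illustrative.

```python
fruit = "orange"

# The slice [1:6:2] selects the characters at indices 1, 3, and 5 ...
print( fruit[1:6:2] )

# ... which are exactly the indices that range( 1, 6, 2 ) produces.
selected = ""
for i in range( 1, 6, 2 ):
    selected += fruit[i]
print( selected )
```

Both print statements display the same three letters.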

Reversing a string using [::-1] is conceptually similar to traversing the string from the last character to the beginning of the string using backward steps of size 1.

fruit = "banana"
print( fruit[::-1] )
for i in range( 5, -1, -1 ):
    print( fruit[i] )
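
The loop above starts at index 5 because "banana" has six letters. A sketch that works for a string of any length can derive the starting index from len() instead; the variable name reversed_fruit is made up for this example.

```python
fruit = "banana"

# Start at the last index, len(fruit)-1, and stop after index 0.
reversed_fruit = ""
for i in range( len( fruit )-1, -1, -1 ):
    reversed_fruit += fruit[i]

print( reversed_fruit )
print( reversed_fruit == fruit[::-1] )
```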