Searching in lists

This chapter focuses on searching in Python lists. Python has implemented several search methods for lists, but since you might want to search in different data structures too, it is good to know how search methods may work. Note that you may assume that accessing elements of lists by their index is a very fast operation for Python, which is O(1). Why that is, will be discussed in the next chapter, in which it is relevant to understand exactly how Python has implemented lists.

The two search methods which Python has implemented for lists are:

index(), which returns the index in the list of the first occurrence of the element that is given as argument
count(), which returns how often the element that is given as argument occurs in the list

There is also the in operator (which, you may have learned from the chapter on operator overloading, is actually the __contains__() method), which returns True if an element is in the list, and False otherwise.

Linear search

If there is nothing known about the contents of the lists, the only available search method is a so-called “linear search”, i.e., processing the items all in order until the target element is found or the end of the list is reached. Python has implemented the methods discussed above as linear search methods. For clarity, I will give functions that emulate the behavior of these search methods below:

def my_index( item, tlist ):
    i = 0
    while i < len( tlist ):
        if tlist[i] == item:
            return i
        i += 1
    return -1

def my_count( item, tlist ):
    count = 0
    i = 0
    while i < len( tlist ):
        if tlist[i] == item:
            count += 1
        i += 1
    return count

def my_in( item, tlist ):
    return my_index( item, tlist ) >= 0

fruitlist = ["apple", "banana", "cherry", "apple", "apple", "peach", "cherry", "apple"]
print( "The index of banana is", my_index( "banana", fruitlist ) )
print( "The count of apple is", my_count( "apple", fruitlist ) )
if my_in( "durian", fruitlist ):
    print( "There is a durian on the list" )
else:
    print( "There is no durian on the list" )

Note that I could have implemented the my_count() function by using a for ... in ... construction. I did not do that for two reasons: (1) it would then use the reserved word in, and I am showing my own implementation of in; and (2) in practice, Python processes lists using indices, as that can be done very speedily.

Time complexity of linear search

The time complexities of the implementations I give above are as follows (you can check):

my_index() is O(n) and Ω(1)
my_count() is O(n), Ω(n), and Θ(n)
my_in() is O(n) and Ω(1)

All linear search methods are O(n). They cannot be anything else, because (in the worst-case scenario) they have to see each and every element once.

To improve upon this, either the data has to be ordered in a particular way in a list, or a data structure must be designed that supports an efficient way of ordering the data. I will discuss two common search methods which can be used with lists below. More complex data structures that support advanced searching will follow in later chapters.