Lecture 8

Today:

  • More about lists.
  • List comprehension
  • Example use of lists to keep track of things
  • Mutable vs immutable objects in Python

Last time, we talked about how to make and use lists. Some highlights from last time:

In [1]:
# defining lists:
xs = [1,2,4,1,5,"hello"]
print(xs)
[1, 2, 4, 1, 5, 'hello']
In [2]:
# changing elements:
xs = [1,2,4,1,5,"hello"]
xs[0] = 999
print(xs)
[999, 2, 4, 1, 5, 'hello']
In [3]:
# accessing elements
print("the first element of the list is:", xs[0])
print("the last element of the list is:", xs[-1])
the first element of the list is: 999
the last element of the list is: hello
In [4]:
# length of list
print("the length of the list is:", len(xs))
the length of the list is: 6
In [5]:
# adding a new element to a list:
xs.append("goodbye")
print(xs)
[999, 2, 4, 1, 5, 'hello', 'goodbye']
In [6]:
# concatenating lists using +
xs = [1,2,3,4]
print(xs + ["hello", "goodbye"])
print("but xs is still:", xs)
[1, 2, 3, 4, 'hello', 'goodbye']
but xs is still: [1, 2, 3, 4]
In [7]:
# We could do:
xs = xs + ["hello", "goodbye"]
print(xs)
[1, 2, 3, 4, 'hello', 'goodbye']
In [8]:
# alternatively, we can use extend
xs = [1,2,3,4]
xs.extend(["hello", "goodbye"])
print(xs)
[1, 2, 3, 4, 'hello', 'goodbye']
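
The ways of growing a list shown above differ in one detail that matters later in this lecture: + builds a new list, while append and extend change the existing one. A small sketch of the difference (the variable names are only for illustration):

```python
xs = [1, 2, 3]

# + builds a brand-new list; xs itself is untouched
ys = xs + [4, 5]
print(xs, ys)        # [1, 2, 3] [1, 2, 3, 4, 5]

# extend adds each element of its argument to xs, in place
xs.extend([4, 5])
print(xs)            # [1, 2, 3, 4, 5]

# append adds its argument as ONE element, even if it is a list
xs.append([6, 7])
print(xs)            # [1, 2, 3, 4, 5, [6, 7]]
print(len(xs))       # 6
```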
In [9]:
# range
list(range(10))
Out[9]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [10]:
list(range(5,10))
Out[10]:
[5, 6, 7, 8, 9]
In [11]:
list(range(5, 20, 3))
Out[11]:
[5, 8, 11, 14, 17]
In [12]:
# looping over a list using for
for i in range(10):
    print(i)
0
1
2
3
4
5
6
7
8
9

List comprehension

In math, we use the following notation a lot when defining sets: $$ X = \{ 2n \,\, | \,\, n \in \mathbb{Z} \}$$ $X$ would be the set of even numbers. Similarly, we can define lists from other lists in Python.

In the tradition of naming things with words that ordinary people can't understand, this is called list comprehension.

In [13]:
evens = [2*n for n in range(10)]
print(evens)
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
In [14]:
odds = [n+1 for n in evens]
print(odds)
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
In [15]:
[ 2**(2**n) for n in range(10) ]
Out[15]:
[2,
 4,
 16,
 256,
 65536,
 4294967296,
 18446744073709551616,
 340282366920938463463374607431768211456,
 115792089237316195423570985008687907853269984665640564039457584007913129639936,
 13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084096]
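
A list comprehension is shorthand for a loop that appends to an initially empty list. The sketch below spells out that loop for the evens example. It also shows the optional if clause, which was not used above and is included here only as an illustration.

```python
# the comprehension from above
evens = [2*n for n in range(10)]

# the same list built with an explicit loop
evens_loop = []
for n in range(10):
    evens_loop.append(2*n)

print(evens == evens_loop)   # True

# a comprehension can also filter with an if clause
multiples_of_3 = [n for n in range(20) if n % 3 == 0]
print(multiples_of_3)        # [0, 3, 6, 9, 12, 15, 18]
```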


Example of using lists to remember things

We didn't do this in lecture. But you can read it to see an example of using lists in a real problem.

One great reason to use lists is to have lots of variables to keep track of things.

We are looking for the numbers with the longest sequences in the $3n+1$ problem.

Recall that the $3n+1$ problem works as follows:

  • Start with $a_0 = k$
  • if $a_i$ is odd, let $a_{i+1} = 3a_i + 1$
  • if $a_i$ is even, let $a_{i+1} = a_i/2$

If we start with $a_0 = 3$, the sequence is: $3, 10, 5, 16, 8, 4, 2, 1$

If we start with $a_0 = 9$, the sequence is: $9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1$

The funny thing, which nobody understands, is that the sequence always seems to end with 1. The claim that it always does is called the Collatz conjecture.

Let's find the number $< 1000000$ with the longest sequence

In [16]:
# let's generate the list
def collist(a):
    lst = []
    while a!=1:
        if a%2 == 0:
            a = a // 2
        else:
            a = 3*a + 1
        lst.append(a)
    return lst
In [17]:
collist(9)
Out[17]:
[28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
In [18]:
# let's generate just the length:
def collen(a):
    return len(collist(a))
In [19]:
collen(9)
Out[19]:
19

But note that we are building a whole list in memory just to compute its length, which is wasteful. We can compute the length directly without building the list.

In [20]:
# compute the length without building the list
def collen(a):
    total_len = 0
    if a < 1:
        return 0
    while a!=1:
        if a%2 == 0:
            a = a // 2
        else:
            a = 3*a + 1
        total_len += 1
    return total_len
In [21]:
collen(9)
# still works:
Out[21]:
19

Let's look for the maximum length up to 1000

In [22]:
mm = 1000

max_len = -1
max_len_came_from = -1

for i in range(mm):
    le = collen(i)
    if le > max_len:
        max_len = le
        max_len_came_from = i
        
print("maximum sequence length was for the number", max_len_came_from, "and it was", max_len) 
maximum sequence length was for the number 871 and it was 178


Can we be more efficient?

Yes. In fact, we keep doing the same work over and over again. For example, when computing collen(5), we go through 5,16,8,4,2,1, but then when we compute collen(6), we go through 6,3,10,5,16,8,4,2,1, which repeats all of the work for 5. Redundant!!! SAD!

Solution: let's remember the lengths we already computed.

In [23]:
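# start with list full of 0s: lengths[k] == 0 means "not computed yet"
mm = 1000000
llen = 10*mm
lengths = [0 for i in range(llen)]

def smart_collen(a):
    start = a
    total_len = 0
    if a < 1:
        return 0
    while a!=1:
        if a < llen and lengths[a] != 0:
            # we already know how long the rest of the sequence is
            total_len += lengths[a]
            break
        #else
        if a%2 == 0:
            a = a // 2
        else:
            a = 3*a + 1
        total_len += 1
    # remember the answer so that later calls can reuse it
    if start < llen:
        lengths[start] = total_len
    return total_len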

max_len = -1
max_len_came_from = -1
for i in range(mm):
    le = smart_collen(i)
    if le > max_len:
        max_len = le
        max_len_came_from = i

print("maximum sequence length was for the number", max_len_came_from, "and it was", max_len)  
maximum sequence length was for the number 837799 and it was 524
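
To see how much remembering the lengths helps, one can time both approaches on a smaller range. This is only a rough sketch: the limit of 10000 and the names cache, plain_collen, memo_collen and longest are made up for this example, and the timings will differ from machine to machine.

```python
import time

LIMIT = 10000              # much smaller than 1000000, so this runs quickly
cache = [0] * LIMIT        # cache[k] is the length for k once known (0 = unknown)

def plain_collen(a):
    """Number of steps needed to reach 1, recomputed from scratch each time."""
    steps = 0
    while a > 1:
        if a % 2 == 0:
            a = a // 2
        else:
            a = 3*a + 1
        steps += 1
    return steps

def memo_collen(a):
    """Same count, but stops early on reaching a number already in cache."""
    start = a
    steps = 0
    while a > 1:
        if a < LIMIT and cache[a] != 0:
            steps += cache[a]      # reuse the remembered remainder
            break
        if a % 2 == 0:
            a = a // 2
        else:
            a = 3*a + 1
        steps += 1
    if 0 <= start < LIMIT:
        cache[start] = steps       # remember the answer for later calls
    return steps

def longest(collen_function):
    """Return (start, length) of the longest sequence for 1 <= start < LIMIT."""
    best_start, best_len = -1, -1
    for i in range(1, LIMIT):
        le = collen_function(i)
        if le > best_len:
            best_start, best_len = i, le
    return best_start, best_len

t0 = time.perf_counter()
plain_result = longest(plain_collen)
t1 = time.perf_counter()
memo_result = longest(memo_collen)
t2 = time.perf_counter()

print("plain:   ", plain_result, "in", round(t1 - t0, 3), "seconds")
print("memoized:", memo_result, "in", round(t2 - t1, 3), "seconds")
print("same answer:", plain_result == memo_result)
```

Both versions should report the same starting number and length; only the running time should differ.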
In [24]:
collist(837799)
Out[24]:
[2513398,
 1256699,
 3770098,
 1885049,
 5655148,
 2827574,
 1413787,
 4241362,
 2120681,
 6362044,
 3181022,
 1590511,
 4771534,
 2385767,
 7157302,
 3578651,
 10735954,
 5367977,
 16103932,
 8051966,
 4025983,
 12077950,
 6038975,
 18116926,
 9058463,
 27175390,
 13587695,
 40763086,
 20381543,
 61144630,
 30572315,
 91716946,
 45858473,
 137575420,
 68787710,
 34393855,
 103181566,
 51590783,
 154772350,
 77386175,
 232158526,
 116079263,
 348237790,
 174118895,
 522356686,
 261178343,
 783535030,
 391767515,
 1175302546,
 587651273,
 1762953820,
 881476910,
 440738455,
 1322215366,
 661107683,
 1983323050,
 991661525,
 2974984576,
 1487492288,
 743746144,
 371873072,
 185936536,
 92968268,
 46484134,
 23242067,
 69726202,
 34863101,
 104589304,
 52294652,
 26147326,
 13073663,
 39220990,
 19610495,
 58831486,
 29415743,
 88247230,
 44123615,
 132370846,
 66185423,
 198556270,
 99278135,
 297834406,
 148917203,
 446751610,
 223375805,
 670127416,
 335063708,
 167531854,
 83765927,
 251297782,
 125648891,
 376946674,
 188473337,
 565420012,
 282710006,
 141355003,
 424065010,
 212032505,
 636097516,
 318048758,
 159024379,
 477073138,
 238536569,
 715609708,
 357804854,
 178902427,
 536707282,
 268353641,
 805060924,
 402530462,
 201265231,
 603795694,
 301897847,
 905693542,
 452846771,
 1358540314,
 679270157,
 2037810472,
 1018905236,
 509452618,
 254726309,
 764178928,
 382089464,
 191044732,
 95522366,
 47761183,
 143283550,
 71641775,
 214925326,
 107462663,
 322387990,
 161193995,
 483581986,
 241790993,
 725372980,
 362686490,
 181343245,
 544029736,
 272014868,
 136007434,
 68003717,
 204011152,
 102005576,
 51002788,
 25501394,
 12750697,
 38252092,
 19126046,
 9563023,
 28689070,
 14344535,
 43033606,
 21516803,
 64550410,
 32275205,
 96825616,
 48412808,
 24206404,
 12103202,
 6051601,
 18154804,
 9077402,
 4538701,
 13616104,
 6808052,
 3404026,
 1702013,
 5106040,
 2553020,
 1276510,
 638255,
 1914766,
 957383,
 2872150,
 1436075,
 4308226,
 2154113,
 6462340,
 3231170,
 1615585,
 4846756,
 2423378,
 1211689,
 3635068,
 1817534,
 908767,
 2726302,
 1363151,
 4089454,
 2044727,
 6134182,
 3067091,
 9201274,
 4600637,
 13801912,
 6900956,
 3450478,
 1725239,
 5175718,
 2587859,
 7763578,
 3881789,
 11645368,
 5822684,
 2911342,
 1455671,
 4367014,
 2183507,
 6550522,
 3275261,
 9825784,
 4912892,
 2456446,
 1228223,
 3684670,
 1842335,
 5527006,
 2763503,
 8290510,
 4145255,
 12435766,
 6217883,
 18653650,
 9326825,
 27980476,
 13990238,
 6995119,
 20985358,
 10492679,
 31478038,
 15739019,
 47217058,
 23608529,
 70825588,
 35412794,
 17706397,
 53119192,
 26559596,
 13279798,
 6639899,
 19919698,
 9959849,
 29879548,
 14939774,
 7469887,
 22409662,
 11204831,
 33614494,
 16807247,
 50421742,
 25210871,
 75632614,
 37816307,
 113448922,
 56724461,
 170173384,
 85086692,
 42543346,
 21271673,
 63815020,
 31907510,
 15953755,
 47861266,
 23930633,
 71791900,
 35895950,
 17947975,
 53843926,
 26921963,
 80765890,
 40382945,
 121148836,
 60574418,
 30287209,
 90861628,
 45430814,
 22715407,
 68146222,
 34073111,
 102219334,
 51109667,
 153329002,
 76664501,
 229993504,
 114996752,
 57498376,
 28749188,
 14374594,
 7187297,
 21561892,
 10780946,
 5390473,
 16171420,
 8085710,
 4042855,
 12128566,
 6064283,
 18192850,
 9096425,
 27289276,
 13644638,
 6822319,
 20466958,
 10233479,
 30700438,
 15350219,
 46050658,
 23025329,
 69075988,
 34537994,
 17268997,
 51806992,
 25903496,
 12951748,
 6475874,
 3237937,
 9713812,
 4856906,
 2428453,
 7285360,
 3642680,
 1821340,
 910670,
 455335,
 1366006,
 683003,
 2049010,
 1024505,
 3073516,
 1536758,
 768379,
 2305138,
 1152569,
 3457708,
 1728854,
 864427,
 2593282,
 1296641,
 3889924,
 1944962,
 972481,
 2917444,
 1458722,
 729361,
 2188084,
 1094042,
 547021,
 1641064,
 820532,
 410266,
 205133,
 615400,
 307700,
 153850,
 76925,
 230776,
 115388,
 57694,
 28847,
 86542,
 43271,
 129814,
 64907,
 194722,
 97361,
 292084,
 146042,
 73021,
 219064,
 109532,
 54766,
 27383,
 82150,
 41075,
 123226,
 61613,
 184840,
 92420,
 46210,
 23105,
 69316,
 34658,
 17329,
 51988,
 25994,
 12997,
 38992,
 19496,
 9748,
 4874,
 2437,
 7312,
 3656,
 1828,
 914,
 457,
 1372,
 686,
 343,
 1030,
 515,
 1546,
 773,
 2320,
 1160,
 580,
 290,
 145,
 436,
 218,
 109,
 328,
 164,
 82,
 41,
 124,
 62,
 31,
 94,
 47,
 142,
 71,
 214,
 107,
 322,
 161,
 484,
 242,
 121,
 364,
 182,
 91,
 274,
 137,
 412,
 206,
 103,
 310,
 155,
 466,
 233,
 700,
 350,
 175,
 526,
 263,
 790,
 395,
 1186,
 593,
 1780,
 890,
 445,
 1336,
 668,
 334,
 167,
 502,
 251,
 754,
 377,
 1132,
 566,
 283,
 850,
 425,
 1276,
 638,
 319,
 958,
 479,
 1438,
 719,
 2158,
 1079,
 3238,
 1619,
 4858,
 2429,
 7288,
 3644,
 1822,
 911,
 2734,
 1367,
 4102,
 2051,
 6154,
 3077,
 9232,
 4616,
 2308,
 1154,
 577,
 1732,
 866,
 433,
 1300,
 650,
 325,
 976,
 488,
 244,
 122,
 61,
 184,
 92,
 46,
 23,
 70,
 35,
 106,
 53,
 160,
 80,
 40,
 20,
 10,
 5,
 16,
 8,
 4,
 2,
 1]

Mutable vs Immutable types in Python

This is a technical point about how Python works, but it is very important.

In [25]:
# guess what will happen
x = 1
print(x)
1
In [26]:
# guess what will happen
y = x
x += 1
print(y, x)
1 2

Even though we said y=x, and then changed x, the value of y didn't change. Why?

It is because when we said y=x, y was set to the value that x had at that moment. The later x += 1 gave x a new value and left y alone.

In [27]:
# Let's try the same with lists
In [28]:
x = [1,2,3,4]
y = x
x[0] = 999
print(y)
[999, 2, 3, 4]

What is going on? We saw one behaviour for numbers, and a totally different behaviour for lists.

This is because numbers are immutable whereas lists are mutable.

In Python, a name like x refers to an object stored somewhere in memory, and y=x always makes y refer to the same object as x. What differs is what can happen afterwards.

An immutable object, such as the integer 1, can never be changed. So x += 1 cannot modify the 1 itself: it creates the new integer 2 and makes x refer to it, while y still refers to 1. In practice, this behaves as if the value of x had been copied over to y.

A mutable object, such as a list, can be changed in place. When we say x = [1,2,3,4], the list [1,2,3,4] is created somewhere in memory, and x holds the address of the location where it is stored. We can think of x as a pointer to the actual place of [1,2,3,4] in memory.

Now, when we put x = [1,2,3,4] and then y=x, y is set to the same address as x. So x and y are pointing to the same [1,2,3,4] in memory. This means that if we then change x[0], this will also change y[0].
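
One way to check whether two names refer to the same object is the is operator (the built-in id function returns a number that identifies the object). A short sketch, with variable names chosen only for illustration:

```python
x = [1, 2, 3, 4]
y = x                  # y refers to the same list as x
print(x is y)          # True
print(id(x) == id(y))  # True

x[0] = 999             # changes the one shared list
print(y)               # [999, 2, 3, 4]

a = 1
b = a                  # b gets the value 1
a += 1                 # a now refers to a new integer; b is untouched
print(a, b)            # 2 1
print(a is b)          # False
```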

Mutable objects as function arguments

In [29]:
# what will happen?
def f(a):
    a += a
    return a

x = 1
f(x)
Out[29]:
2
In [30]:
print(x)
1

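The value of x didn't change. Integers are immutable, so a += a inside the function created a new integer and made the local variable a refer to it; the x outside the function still refers to 1.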

Let's try the same with lists:

In [31]:
# what will happen?
def f(a):
    a += a
    return a

x = [1,2,3,4]
f(x)
Out[31]:
[1, 2, 3, 4, 1, 2, 3, 4]
In [32]:
x
Out[32]:
[1, 2, 3, 4, 1, 2, 3, 4]

The value of x did change!!! This is because x refers to a list, and lists are mutable: a += a changed the list itself, which is the same list that x refers to.

This can actually be a good thing: it means that we can manipulate a list inside a function. In the homework, you will be asked to write a function that will reverse a list for example. You will just call reverse(x) and the list referred to by x will be reversed.
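
Whether the caller's list changes depends on what the function does with it. In the sketch below (the function names are made up for illustration), a += a extends the existing list in place, while a = a + a builds a new list and only changes what the local name a refers to.

```python
def double_in_place(a):
    a += a          # for lists, += extends the existing list
    return a

def double_as_new(a):
    a = a + a       # + builds a new list; the local name a now refers to it
    return a

x = [1, 2]
result = double_in_place(x)
print(x)            # [1, 2, 1, 2]  (the caller's list changed)
print(result is x)  # True

y = [1, 2]
result = double_as_new(y)
print(y)            # [1, 2]  (the caller's list is untouched)
print(result)       # [1, 2, 1, 2]
```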

So what if x = [1,2,3,4] and we really want a copy of [1,2,3,4] that will be different? Easy:

In [33]:
import copy
x = [1,2,3,4]
y = copy.copy(x)
print(x,y)
x[0] = 999
print(x,y)
[1, 2, 3, 4] [1, 2, 3, 4]
[999, 2, 3, 4] [1, 2, 3, 4]

As expected, y didn't change because it was a fresh copy, held at a different place in memory.
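
One caveat: copy.copy makes a shallow copy. The outer list is new, but any lists inside it are still shared with the original. For nested lists, copy.deepcopy copies everything. A small sketch with made-up variable names:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)      # new outer list, same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists

nested[0][0] = 999               # change an inner list
nested.append([5, 6])            # change the outer list

print(nested)    # [[999, 2], [3, 4], [5, 6]]
print(shallow)   # [[999, 2], [3, 4]]
print(deep)      # [[1, 2], [3, 4]]
```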

Mutable:

  • Lists
  • Dictionaries (later)
  • Numpy arrays (later)

Immutable:

  • Integers, floats, bools
  • Strings
  • Tuples (later)
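
As a quick check that strings are immutable, the sketch below tries to change one character and catches the resulting error. It uses try/except and slicing, which may not have been covered yet; they are here only to keep the example runnable.

```python
s = "hello"
try:
    s[0] = "J"                 # strings do not allow changing a character
except TypeError as error:
    print("not allowed:", error)

# instead, build a new string and make s refer to it
t = s
s = "J" + s[1:]
print(s, t)                    # Jello hello
```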