Code Wrangling (python)

David White
davey@autistici.org
davidnwhite.com/resources

Editor

Python does not have its own editor
Notepad++ is quick to install
https://notepad-plus-plus.org/downloads/

Opening Terminal

Mac
In Finder, open
/Applications/Utilities/Terminal

Windows
Click start. Type 'cmd' then enter.

Python Install

https://www.python.org/downloads/
OR
Via terminal (ask me!)

Windows

Windows (https://docs.python.org/3/using/windows.html)
Windows Subsystem for Linux

On Installation

Command line!

Windows
https://docs.microsoft.com/en-us/windows/wsl/install-win10

Python at a glance

Python is different

Weirdness

The good

The bad

OOP implementations

Matlab/Octave
object oriented
most everything a matrix (an object)
R
object oriented
most everything an atomic vector (an object)
Python
Object oriented
Everything an object (more loose)

OOP

Pull up a terminal and type

python

or, if you installed ipython

ipython

What is OOP?

Extends the idea of typing.

Class vs Object

Class and type are loosely interchangeable terms
Class is usually associated with higher order
Object is an instance of a class

C Example

int A;
int B = 8;

A and B are instances of the integer type

Python example

in python…

int(8)

is an object of the "int" class

  1. Details

    Some higher order stuff going on under the hood

    dir(int(8))

Methods

…are class specific functions

int(8).conjugate() # note the trailing parentheses

In other languages, this would be a valid method call:

conjugate(int(8));

Attributes

…are variables tied to objects.
Usually data that describes the object

int(8).real # no trailing parentheses
int(8).imag # no trailing parentheses
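A minimal sketch of the difference between the two, using only built-ins. The variable names are illustrative and not part of the original slides.

```python
n = int(8)

# An attribute is read without parentheses.
real_part = n.real        # 8
imag_part = n.imag        # 0

# A method must be called with parentheses.
conj = n.conjugate()      # 8, since an integer is its own conjugate

# Without parentheses you get the method object itself, not its result.
method_object = n.conjugate
print(callable(method_object))  # True
print(callable(n.real))         # False: it is a plain value
```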

Why bother?

Art of Coding

Code Wrangling
Writing complex programs that are

  1. Easy to understand/use
  2. Efficient and fast

Basics

Query objects

a=int(8)
type(a)
dir(a)  # list attributes and methods
isinstance(a, int)

help(a)
help(int)

hasattr(a,'real')
getattr(a,'real')

callable(a.conjugate)

Note

a.hasattr('real') # does not work: raises AttributeError

hasattr is a function, not a method.
It works on more than integer types.
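As a rough illustration of that last point, the same query functions can be applied to objects of several built-in types. The list of sample objects is made up for this sketch.

```python
things = [8, "eight", [8], (8,), {"eight": 8}]

results = []
for thing in things:
    # hasattr takes the object first and the attribute name as a string.
    has_real = hasattr(thing, "real")
    has_append = hasattr(thing, "append")
    results.append((type(thing).__name__, has_real, has_append))
    print(type(thing).__name__, has_real, has_append)

# getattr accepts a default that is returned when the attribute is missing.
missing = getattr("eight", "real", None)
```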

Builtin Types

"Primitive"

Collections

Mutability

Whether an object's contents can be changed after it is created.
It does not mean you cannot rebind a variable to a new object; an immutable object is simply read-only once created.

Treat mutable things as objects
Treat immutables as values

Mutable Types

Heap

Immutable Types

Stack

Heap

Why?

It is all about efficient memory management.
Immutable types are used as a foundation for the rest of the code base.
Immutable types limit human error and can be faster.

What this means in practice

Not much: you cannot change strings, ranges, or tuples by indexing

mylist =  [1, 2, 3]
mytuple = (1, 2, 3)

# Reassignment is valid in lists...
mylist=mylist[0:2] + mylist[0:2]
print(mylist)

# and tuples...
mytuple=mytuple[0:2] + mytuple[0:2]
print(mytuple)

# Changing elements is valid in lists...
mylist[0]=0 # works
print(mylist)

# not tuples
mytuple[0]=1 # does not work: raises TypeError

#workaround
mytuple=(0,) + mytuple[1:]
print(mytuple)
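One way to see the difference between changing an object and reassigning a name is to compare object identities. This is a small sketch; the helper names `list_id` and `old_tuple` are illustrative.

```python
mylist = [1, 2, 3]
mytuple = (1, 2, 3)

list_id = id(mylist)
mylist[0] = 0                      # mutates the existing list in place
same_list = id(mylist) == list_id  # True: still the same object

old_tuple = mytuple
mytuple = (0,) + mytuple[1:]       # builds a new tuple and rebinds the name
same_tuple = mytuple is old_tuple  # False: the name now points to a new object

try:
    old_tuple[0] = 0               # item assignment is not supported
except TypeError as err:
    print("TypeError:", err)
```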

Collection types

Lists

ordered

mylist = ["apple", "banana", "cherry", "orange"]

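mylist.append("kiwi")            # add to the end
mylist.pop()                     # remove and return the last element
mylist.count("apple")            # number of occurrences
mylist.extend(["fig", "pear"])   # append every element of another list
mylist.insert(1, "grape")        # insert before index 1
mylist.remove("grape")           # remove the first matching element
mylist.reverse()                 # reverse in place
mylist.sort()                    # sort in place

dir(mylist)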
  1. Indexing

    1. Example 1

      Zero indexing

      mylist[0] # first element
      mylist[-1] # last element
      mylist[1:3] # 2nd to third
      mylist[:] #  all elements
      mylist[1:] #  2nd to last elements
      mylist[:-1] # first to second-to-last
    2. Example 2

      Think of indices as positions before the elements
      If no second number is included, treat it as the first number plus one
      list=[A, B, C]
      list[0:1]
          [*A* B C]
      list[0:0]
          [** A B C]
      list[0:2]
          [*A B* C]
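The picture above can be checked directly. A short sketch with a three-letter list (the name `letters` is illustrative):

```python
letters = ["A", "B", "C"]

# Positions sit between elements:  0 A 1 B 2 C 3
print(letters[0:1])  # ['A']
print(letters[0:0])  # []
print(letters[0:2])  # ['A', 'B']
print(letters[0])    # 'A' : a single index returns the element, not a list
print(letters[-1])   # 'C'
print(letters[:-1])  # ['A', 'B']
```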

Tuples

The immutable list

mytuple = ("apple", "banana", "cherry", "orange")

Use if you have a list that doesn't change.
Faster & protected from changing

  1. Check immutability

    mytuple=mytuple[1:3] # a reassignment
    mytuple[0]="blueberry" # Doesn't work: raises TypeError

    dir(mytuple)

Dictionaries

Like lists, but each value is looked up by a key

mydict = {
  "apple": "yabloka",
  "banana": "banan",
  "cherry": "veeshnya",
  "orange": "apelseen",
}

mydict["apple"]
mydict[1] # KeyError: dictionaries are indexed by key, not by position
mydict.keys()
mydict.values()
mydict.items()
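A brief sketch of common dictionary usage, with a shortened version of the dictionary above. The key "steak" and the default "unknown" are made up for the example.

```python
mydict = {"apple": "yabloka", "banana": "banan"}

# Indexing is by key, not by position.
print(mydict["apple"])

# A missing key raises KeyError; .get returns a default instead.
translation = mydict.get("steak", "unknown")

# items() yields (key, value) pairs.
pairs = []
for english, russian in mydict.items():
    pairs.append(english + " -> " + russian)

# Dictionaries are mutable: assigning to a new key adds an entry.
mydict["cherry"] = "veeshnya"
```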

Ranges

myrange=range(10)

for i in myrange:
    print(i)

for i in myrange:
    print(i+1)

myrange=range(1,10)
for i in myrange:
    print(i)

myrange=range(1,10,2)
for i in myrange:
    print(i)
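Ranges can also be inspected without a loop. The following sketch converts one to a list and confirms that it is immutable; the variable names are illustrative.

```python
myrange = range(1, 10, 2)   # start, stop (exclusive), step

as_list = list(myrange)     # [1, 3, 5, 7, 9]
count = len(myrange)        # 5
last = myrange[-1]          # 9

# Ranges are immutable: item assignment raises TypeError.
try:
    myrange[0] = 0
    assignable = True
except TypeError:
    assignable = False
```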

Sets

Unordered collection of unique elements

myset1 = {"apple", "banana", "cherry", "orange"}
myset2 = {"apple", "banana"}
myset3 = {"apple"}
myset4 = {"apple","blueberry"}
myset5 = {"steak"}

myset4.intersection(myset4)
myset4.union(myset2)
myset4.difference(myset2)
myset4.isdisjoint(myset5)
myset4.isdisjoint(myset4)
myset4.issubset(myset1)
myset4.issuperset(myset3)
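For reference, here is a sketch of what several of these calls return, reusing the same sets. The first call intersects with myset1 rather than with the set itself, so that the result is informative.

```python
myset1 = {"apple", "banana", "cherry", "orange"}
myset2 = {"apple", "banana"}
myset3 = {"apple"}
myset4 = {"apple", "blueberry"}
myset5 = {"steak"}

common = myset4.intersection(myset1)   # {'apple'}
combined = myset4.union(myset2)        # {'apple', 'banana', 'blueberry'}
only_in_4 = myset4.difference(myset2)  # {'blueberry'}
no_overlap = myset4.isdisjoint(myset5) # True
is_sub = myset4.issubset(myset1)       # False: 'blueberry' is not in myset1
is_super = myset4.issuperset(myset3)   # True

# Duplicates collapse: a set keeps one copy of each element.
unique = set(["apple", "apple", "banana"])
```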

Other types

NoneType

None
x=None
print(x)

if not x:
    print(True)

if not False:
    print(True)

Slice

Reusable index ranges, stored as objects

A=slice(0,2)
mytuple[A]
mylist[slice(1,1)]=3 # doesn't work: slice assignment needs an iterable such as [3]

Copy

a = [ 1, 2, 3 ]
b=a       # b is a second name for the same list
b[0]=2
print(b)
print(a)  # a changed too

Details

Copy contents, not address

a = [ 1, 2, 3 ]
b=a[:]
b[0]=2
print(b)
print(a)

import copy
a = [ 1, 2, 3 ]
b=copy.copy(a)
b[0]=2
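Note that `a[:]` and `copy.copy` copy only the outer list. A sketch of how that plays out with nested lists, using `copy.deepcopy` from the standard library for a full copy (the variable names are illustrative):

```python
import copy

a = [[1, 2], [3, 4]]
alias = a                  # same object, second name
shallow = copy.copy(a)     # new outer list, shared inner lists
deep = copy.deepcopy(a)    # new outer list and new inner lists

shallow[0][0] = 99         # changes the inner list that a also holds
deep[1][1] = -1            # touches only the deep copy

print(a)     # [[99, 2], [3, 4]]
print(deep)  # [[1, 2], [3, -1]]
```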

Conditionals

for i in range(11):
    if i == 0:
        pass
    elif i == 10:
        print("end")
    elif not i % 2 == 0:
        print(str(i) + ": odd")
    else:
        print(str(i) + ": even")

Script object

Open a new terminal

mkdir myPython
cd myPython
touch myscript.py

In your editor of choice open the file you just created

def main():
    print("my first script")


if __name__ == '__main__':
    main()

Open a terminal

cd path/to/myPython
python myscript.py

Details

Simplification:
Your script is an instance of a "file class"
__name__ is a special attribute of your script
Its value describes how the file is being used
When run directly from a terminal, as a script (like we did), it takes on the value '__main__'
*.py files can be used for other purposes, such as being imported
To ensure it runs the way we want, we always include the check

if __name__ == '__main__':

The body of our script gets put in the 'main()' function.

Details

if __name__ == '__main__':
    #main()
    print(__name__)
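To see the other side of this, an imported module reports its own name rather than '__main__'. A small sketch using the standard `math` module; the `describe` helper is hypothetical and only for illustration.

```python
import math

# Inside an imported module, __name__ is the module's own name.
imported_name = math.__name__   # 'math'

def describe(name):
    """Return how a file with this __name__ value is being used."""
    if name == "__main__":
        return "run directly as a script"
    return "imported as module " + name

print(describe(imported_name))
print(describe(__name__))  # depends on how this file is started
```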

Functions

def seq_even_or_odd(largest):
    for i in range(largest+2):
        if (i==0):
            pass
        elif (i==largest+1):
            print("end")
        elif not i % 2 == 0:
            print(str(i) + ": odd")
        else:
            print(str(i) + ": even")

args

def my_sum(*args):
    out = 0
    for x in args:
        out += x
    return out
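A short usage sketch for the function above, repeated here so the block runs on its own. Inside the function, `args` is a tuple of all positional arguments; the sample values are arbitrary.

```python
def my_sum(*args):
    out = 0
    for x in args:
        out += x
    return out

total = my_sum(1, 2, 3)        # args is the tuple (1, 2, 3)
numbers = [4, 5, 6]
unpacked = my_sum(*numbers)    # * spreads the list into separate arguments
empty = my_sum()               # args is the empty tuple

print(total, unpacked, empty)  # 6 15 0
```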

kwargs

def my_sum(**kwargs):
    # kwargs is a dict mapping keyword names to their values
    for key, val in kwargs.items():
        print(val)
        print(key)

my_sum(abc=3)

def my_w_comb(*args,**kwargs):
    # optional keywords: "plus" is added to every argument, "prod" scales it
    wSum=kwargs.get("plus", 0)
    wProd=kwargs.get("prod", 1)

    # accumulate over all positional arguments
    result=0
    for x in args:
        result += (wSum + x) * wProd

    return result

my_w_comb(1,2,3,4, plus=3,prod=2)

Default values

def print_info(name, lang="python", version="3.7"):
    print("Using function " + name + " with " + lang + " version " + version)

print_info('my_w_comb')
print_info("fmincon","matlab","R2017")

Optional

def print_info(name, lang="python", version=None):
    if not version:
        print("Using function " + name + " with " + lang)
    else:
        print("Using function " + name + " with " + lang + " version " + version)

print_info('my_w_comb')
print_info("fmincon","matlab","R2017")

Classes

Creating


class experiment:
    def __init__(self,name):
        self.name=name

#-----------------------------------

def main():
    a=experiment("test")
    print(a)
    print(a.name)

#-----------------------------------
...

__init__

__init__ is a special method, commonly called the constructor
It is called whenever an instance is created
Constructors are optional.

We can call the constructor again with

a.__init__("diffName")
print(a.name)

This mutates the same object rather than creating a new one
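A sketch that checks this claim by comparing object identities before and after the second call. The class is a stripped-down copy of the one above.

```python
class experiment:
    def __init__(self, name):
        self.name = name

a = experiment("test")
before = id(a)

a.__init__("diffName")            # re-runs the initializer on the same object
same_object = id(a) == before     # True

b = experiment("diffName")        # a normal call creates a separate object
distinct = b is not a             # True

print(a.name, same_object, distinct)
```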

self

'self' is a special binding that refers to the object itself.
A function's signature is the set of argument types that determine which implementation runs.
A call to __init__ must work out which class's __init__ to run.
This process is called dispatch.
In python the type of the first argument, "self", decides this.
Because only one argument takes part in dispatch, python is called a single dispatch language

Calling

a.__init__("diffname")

is equivalent to

experiment.__init__(a,"diffname")

The bare form __init__(a,"diffname") is not valid in python; the method has to be looked up through the class or the object.
Thus in python, when calling a method on an object, the first argument is implied by the object itself.
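A sketch of both ideas: the two call spellings give the same result, and the class of the first argument decides which implementation runs. The `pilot` subclass and the `describe` method are hypothetical, added only for illustration.

```python
class experiment:
    def describe(self):
        return "experiment"

class pilot(experiment):          # hypothetical subclass for illustration
    def describe(self):
        return "pilot experiment"

a = experiment()
p = pilot()

# The two spellings are equivalent: the object fills in 'self'.
implicit = a.describe()
explicit = experiment.describe(a)

# Dispatch: the class of the first argument picks the implementation.
results = [obj.describe() for obj in (a, p)]
print(results)
```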

methods

from datetime import date

class experiment:
    def __init__(self,name):
        self.name=name
        self.get_date()

    def get_date(self):
        self.date=date.today()

class experiment:
...

files

Class definitions should go into their own file

touch __init__.py # create package
touch myClasses.py # class definitions file

in __init__.py

from . import myClasses

move datetime import and class to new file
save both
try running

from myClasses import experiment as exp

change experiment to exp

cmd arguments

In myscript.py

import sys
from myClasses import experiment as exp

def main(name=None):
    a=exp(name)
    print(a.name)


if __name__ == '__main__':
    if len(sys.argv) > 1:
        main(sys.argv[1])
    else:
        main()

in terminal

python myscript.py Dave

Second class

in myClasses.py

from datetime import date

class subject:
    def __init__(self,name=None,age=None,height=None,weight=None):
        if type(age)==str:
            age=float(age)

        if type(height)==str:
            height=float(height)

        if type(weight)==str:
            weight=float(weight)

        self.name=name
        self.age=age
        self.height=height
        self.weight=weight

    def get_bmi(self):
        if not self.weight or not self.height:
            return None

        # 703 is the imperial conversion factor: weight in pounds, height in inches
        return self.weight/(self.height**2) * 703

class experiment:
    def __init__(self,subject=None):
        if subject is None:
            raise ValueError('no subject defined')
        self.subject=subject
        self.get_date()

    def get_date(self):
        self.date=date.today()

in myscript.py

import sys
from myClasses import experiment as exp
from myClasses import subject as subj

def main(*argv):
    mysubj=subj(*argv)
    myexp=exp(mysubj)
    print(myexp.subject.name)
    print(myexp.subject.get_bmi())


if __name__ == '__main__':
    main(*sys.argv[1:])

in Terminal

python myscript.py Dave 32 71 200
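Command-line arguments always arrive as strings, which is why the subject class converts them. A self-contained sketch of that path, assuming the 703 factor means weight in pounds and height in inches. The helpers `to_number` and `bmi` and the simulated `argv` list are hypothetical, not part of the original files.

```python
def to_number(value):
    """Convert a command-line string to float; pass None through."""
    if isinstance(value, str):
        return float(value)
    return value

def bmi(weight_lb, height_in):
    """Imperial BMI: 703 * pounds / inches squared."""
    if not weight_lb or not height_in:
        return None
    return weight_lb / (height_in ** 2) * 703

argv = ["myscript.py", "Dave", "32", "71", "200"]   # simulated sys.argv
name, age, height, weight = argv[1:]
result = bmi(to_number(weight), to_number(height))
print(name, round(result, 1))
```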

Cloning

import copy

def main(*argv):
    mysubj=subj(*argv)
    myexp=exp(mysubj)

    print(myexp)
    myexp2=copy.copy(myexp)
    myexp2.subject.age=20
    myexp2.subject.name='john'
    print(myexp.subject.age)
    print(myexp2.subject.age)

copy.copy makes a shallow copy, so myexp and myexp2 share the same subject object
This is actually a nice feature
Dave's parameters should be the same no matter where he is

import copy

def main(*argv):
    mysubj=subj(*argv)
    myexp=exp(mysubj)

    print(myexp)
    myexp2=copy.deepcopy(myexp) # HERE
    myexp2.subject.age=20
    myexp2.subject.name='john'
    print(myexp.subject.age)
    print(myexp2.subject.age)
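The two cloning behaviours can be compared side by side. A minimal sketch with cut-down versions of the two classes (constructor arguments simplified for illustration):

```python
import copy

class subject:
    def __init__(self, name, age):
        self.name = name
        self.age = age

class experiment:
    def __init__(self, subject):
        self.subject = subject

myexp = experiment(subject("Dave", 32))

shallow = copy.copy(myexp)        # new experiment, same subject object
deep = copy.deepcopy(myexp)       # new experiment, new subject object

shallow.subject.age = 20          # visible through myexp as well
deep.subject.name = "john"        # affects only the deep copy

print(myexp.subject.age, myexp.subject.name)  # 20 Dave
print(deep.subject.age, deep.subject.name)    # 32 john
```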