Theory Part 0 - Scientific Programming

Dave White

Intro To Programming in Science

David White
davey@autistici.org
davidnwhite.com/resources

Scientific programming

Data wrangling

Data wrangling

To wrangle: Round up, herd, or take charge of.

Science

Art of wrangling patterns in nature.

…complicated.

Scientific Programming

… a way to wrangle science.

The scientific pipeline

Where could programming improve your science or make your work more efficient?

Interface

Programming is an interface to computers.

It hides some of the technical details through abstraction.

Interface

Abstraction

low level - close to the hardware
high level - abstracted, less 'code technical'
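
A minimal sketch in Python of one task, summing a list, written at two levels of abstraction. The names values, total_low, and total_high are illustrative and not from the slides. Python is itself a high-level language, so the first version only imitates the explicit bookkeeping that a language closer to the hardware would require.

```python
values = [9, 7, 3]

# Lower-level style: manage the index and the running total by hand.
total_low = 0
i = 0
while i < len(values):
    total_low += values[i]
    i += 1

# Higher-level style: the built-in sum() hides the loop entirely.
total_high = sum(values)

print(total_low, total_high)
```

Both versions give the same result; the second simply leaves the details to the language.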

Levels of abstraction


Scientists? Sometimes all levels…

Where are you?

99.9% on a computer.
99.9% computer programming?

Example: Scientist in Psychology

Test          Experiments are designed and presented in Psychtoolbox
Measure       Subject data is saved in a comma-separated values (CSV) file
Pre-process   Subject data is verified, anonymised, and combined
Analyze       ANOVA is computed using R's built-in procedures
Draw          Figures are plotted using the ggplot package
Publish       Manuscript is written and typeset using LaTeX
Lab notebook  Jupyter notebook
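
The pre-process step in the pipeline above might look roughly like this in Python. This is only a sketch under stated assumptions: the subject names, the column names trial and rt_ms, the rule that reaction times must be positive, and the S01-style codes are all hypothetical, and the CSV text is held in memory instead of being read from real files.

```python
import csv
import io

# Hypothetical raw data: one CSV text per subject (stand-ins for real files).
RAW_FILES = {
    "alice": "trial,rt_ms\n1,512\n2,430\n",
    "bob": "trial,rt_ms\n1,601\n2,-5\n",
}

def preprocess(raw_files):
    """Verify, anonymise, and combine per-subject CSV data into one list of rows."""
    combined = []
    for index, name in enumerate(sorted(raw_files), start=1):
        subject_id = f"S{index:02d}"  # anonymise: replace the name with a code
        for row in csv.DictReader(io.StringIO(raw_files[name])):
            rt = float(row["rt_ms"])
            if rt <= 0:  # verify: skip impossible reaction times (assumed rule)
                continue
            combined.append(
                {"subject": subject_id, "trial": int(row["trial"]), "rt_ms": rt}
            )
    return combined

rows = preprocess(RAW_FILES)
for row in rows:
    print(row)
```

The combined rows could then be written back out as a single CSV file for the analysis step.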

Modern scientific programming

Pushing towards fluency in several languages.

Not as scary as it sounds.
Very similar languages.

Things that transfer well between languages

Basics

MATLAB          Julia           R                  Python
i = 3           i = 3           i <- 3             i = 3
v = [9, 7, 3]   v = [9, 7, 3]   v <- c(9, 7, 3)    v = np.array([9, 7, 3])
length(v)       length(v)       length(v)          len(v)
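
The Python column can be tried directly. The sketch below uses a plain list so that it runs without NumPy installed; the variable names n and first are added for illustration. It also shows one detail that does not transfer between these languages: where indexing starts.

```python
i = 3          # assignment
v = [9, 7, 3]  # a plain list; np.array([9, 7, 3]) is the NumPy equivalent
n = len(v)     # number of elements

first = v[0]   # Python counts from 0; MATLAB, Julia, and R count from 1
print(i, n, first)
```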

Fancier things

Same concepts

Procedural, Object-Oriented
Operations
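
As a rough illustration of the two styles named above, here is the same calculation written procedurally and in an object-oriented way in Python. The names mean, Sample, and data are hypothetical and chosen only for this sketch.

```python
# Procedural style: data and functions are kept separate.
def mean(values):
    return sum(values) / len(values)

# Object-oriented style: data and behaviour are bundled together in a class.
class Sample:
    def __init__(self, values):
        self.values = list(values)

    def mean(self):
        return sum(self.values) / len(self.values)

data = [9, 7, 3]
procedural_result = mean(data)
oop_result = Sample(data).mean()
print(procedural_result == oop_result)
```

The concept carries over to the other languages; only the syntax for defining functions and classes changes.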

Base theory

Identical!

Which Language?

What kind of interface do you want?
There is no ultimate language
Different languages for different purposes
Most languages can do anything you want,
but may not be the best choice for the problem at hand

Brief History

Fortran      1957  Designed for scientific computing
Unix*        1973  Solidified conventions for personal computers
C            1978  Solidified conventions for programming
Matlab       1984  Designed for numeric computing
Perl         1987  Designed for general use by power users
Bash Shell*  1989
Python       1991  Designed for learning how to program
LAPACK*      1992  Linear algebra library used by SciPy and Matlab
R            1993  Designed for statisticians
Julia        2012  Designed for scientific computing

Current state

Strong majorities with:

Julia catching up
Some people use

Comparison

C Perl python Matlab R julia
*Community 2 3 3 1 2
*Data-tables x x x
*Notebook x (x) x
*Conventions 2 3 3 1-3 3
*OOP support 1 2 3 1-2 2
*Free (Freedom) (x) x x x
Free (Beer) x x x x
compiled x
interpreted x x x (x)
Linear Algebra
Tables
Package management 2 3 3 1 3
Syntax simplicity 1 1 2 3 1
Syntax efficiency 1 3 2 2 2
Builtin-Editor x
Dynamic Interpreter (x) x x x
GPU x x x
General Scripting (x) x x (x)
documentation 2 3 3 1-3 2-3
C interface 3 ? 2-3 2 3
Live Interpreter
Style

1 = worst
3 = best

Matlab fails some of the important ones

Comparison part2

https://hyperpolyglot.org/numerical-analysis

Conventions for science

Very loose
Good - Freedom!
Bad - Freedom!

Subfield preference

Features

Goal

Styles

Syntax

Builtin tools

What can a language do out of the box?
What setup is required?
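
One way to get a feel for "out of the box" is to see what works with no packages installed. As a small sketch, Python's standard library already includes a statistics module; the variable names below are illustrative.

```python
import statistics

v = [9, 7, 3]

# Both functions ship with Python itself; no installation step is needed.
average = statistics.mean(v)
middle = statistics.median(v)

print(average, middle)
```

Anything beyond this, such as NumPy arrays or plotting, requires installing extra packages, which is part of the setup cost to weigh.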

Libraries/Community

Library/package
A language may be technically inferior in all regards,
but may be the right choice based upon community
and available packages/libraries

Languages I use

Matlab - Psychtoolbox, sandbox
Python - Citation management
R - Regression
Stan - Advanced stats
Julia - Machine learning
C/C++ - Matlab + speed
Emacs Lisp (Elisp) - Text editor customization
Unix shell scripting (e.g. Bash, Zsh), aka the terminal
Perl - Text processing and database management

The Secret

No secret.

Differences are semantic
If you learn one language inside and out,
it is much easier to learn another

Recommendation

  1. Get your feet wet with Octave
  2. Learn C
  3. Learn OOP