Python vs Java (Modified from http://hyperpolyglot.org/)

a side-by-side reference sheet

Indexes:    arithmetic and logic | strings | regexes | dates and time | arrays and lists | sets | dictionaries | functions | execution control | files | directories | processes and environment | libraries and modules | objects | reflection | web | tests | debugging and profiling | interop

python (1991)

java (1995)

versions used
 

2.7; 3.2

SE6; SE7

implicit prologue

import os, re, sys

none

show version
 

$ python -V

$ java -version
$ javac -version

interpreter/compiler
 

$ python foo.py

$ javac Foo.java
$ java Foo

repl
 

$ python

none

command line script

$ python -c "print('hi')"

none

statement separator
 

newline or ;

newlines not separators inside (), [], {}, triple quote literals, or after backslash: \

 ;


block delimiters
 

offside rule

{}

assignment
 

assignments can be chained but otherwise don't return values:
v = 1

int v = 1;

parallel assignment
 

x, y, z = 1, 2, 3
# raises ValueError:
x, y = 1, 2, 3
# raises ValueError:
x, y, z = 1, 2

none

swap
 

x, y = y, x

none

compound assignment operators: arithmetic, string, logical, bit

# do not return values:
+= -= *= /= //= %= **=
+= *=
&= |= ^=
<<= >>= &= |= ^=

+= -= *= /= %=
+=
&= |= ^=
<<= >>= >>>= &= |= ^=

increment and decrement
 

none

++x x++ --x x--

local variable declarations
 

# in function body:
v = None
a, d = [], {}
x = 1
y, z = 2, 3

String v = null;

int x = 1;

int y = 2;

regions which define local scope

nestable (read only):
  function or method body

top level:
  file
  class
  method body

nestable:
  anonymous class
  anonymous block

global variable

g1, g2 = 7, 8
def swap_globals():
  global g1, g2
  g1, g2 = g2, g1

// no true globals; use a static field of a class:
public static int x = 1;

constant declaration
 

# uppercase identifiers
# constant by convention

PI = 3.14

// a final variable cannot be reassigned
final double PI = 3.14;

to-end-of-line comment
 

# comment

// comment

comment out multiple lines
 

use triple quote string literal:
'''comment line
another line'''

// on each line, or a block comment:
/* comment line
   another line */

null
 

None

null

null test
 

v == None
v is None

v == null

undefined variable access
 

raises NameError

compile-time error

undefined test
 

not_defined = False
try: v
except NameError: not_defined = True

none; undefined variables are rejected at compile time

arithmetic and logic

python

java

true and false
 

True False

true false

falsehoods
 

False None 0 0.0 '' [] {}

false
null, 0, and "" are not booleans and cannot be used as conditions

logical operators
 

and or not

&& || !

conditional expression
 

x if x > 0 else -x

x > 0 ? x : -x

comparison operators
 

comparison operators are chainable:
== != > < >= <=

== != > < >= <=

three value comparison

removed from Python 3:
cmp(0, 1)
cmp('do', 're')

Integer.compare(0, 1)    // SE7
"do".compareTo("re")

convert from string, to string
 

7 + int('12')
73.9 + float('.037')
'value: ' + str(8)

7 + Integer.parseInt("12")
73.9 + Double.parseDouble(".037")
"value: " + String.valueOf(8)

arithmetic operators
 

+ - * / // % **

+ - * / %

integer division and divmod
 

13 // 5
q, r = divmod(13, 5)

int quotient = 13 / 5;
int remainder = 13 % 5;

float division
 

float(13) / 5
# Python 3:
13 / 5

13.0 / 5 or
13 / 5.0

arithmetic functions
 

from math import sqrt, exp, log, \
sin, cos, tan, asin, acos, atan, atan2

import java.lang.Math;

Math.sqrt Math.exp Math.log Math.sin Math.cos Math.tan Math.asin Math.acos Math.atan Math.atan2

arithmetic truncation
 

import math

int(x)
int(round(x))
math.ceil(x)
math.floor(x)
abs(x)

import java.lang.Math;

(long)3.77
Math.round(3.77)
(long)Math.floor(3.77)
(long)Math.ceil(3.77)

min and max
 

min(1,2,3)
max(1,2,3)
min([1,2,3])
max([1,2,3])

Math.min(1, 2)
Math.max(1, 2)
Collections.min(list)
Collections.max(list)

division by zero
 

raises ZeroDivisionError

integers: throws ArithmeticException
floats: returns Infinity, -Infinity, or NaN

integer overflow
 

becomes arbitrary length integer of type long

wraps around silently (two's complement); use long or java.math.BigInteger for larger values

float overflow
 

raises OverflowError

returns Infinity

sqrt -2
 

# raises ValueError:
import math
math.sqrt(-2)

# returns complex float:
import cmath
cmath.sqrt(-2)

returns Double.NaN; no exception is thrown
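
The Java side of the division by zero, overflow, and sqrt rows can be checked with a small sketch. The class and method names here are hypothetical, chosen only for illustration; the behavior shown is what Java's int and double arithmetic does.

```java
// Hypothetical demo class; names are illustrative only.
class ArithmeticEdgeCases {
    // Integer division by zero throws ArithmeticException.
    static boolean intDivisionThrows(int divisor) {
        try {
            int q = 1 / divisor;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // int arithmetic wraps around silently (two's complement).
    static int wrap(int x) {
        return x + 1;
    }

    public static void main(String[] args) {
        System.out.println(intDivisionThrows(0));
        // Floating-point division by zero does not throw.
        System.out.println(Double.isInfinite(1.0 / 0));
        System.out.println(wrap(Integer.MAX_VALUE) == Integer.MIN_VALUE);
        // Floating-point overflow yields an infinity.
        System.out.println(Double.isInfinite(Double.MAX_VALUE * 2));
        // The square root of a negative number is NaN.
        System.out.println(Double.isNaN(Math.sqrt(-2)));
    }
}
```

Because none of the floating-point cases throws, code that needs to detect them has to test with Double.isInfinite or Double.isNaN.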

rational numbers
 

from fractions import Fraction

x = Fraction(22,7)
x.numerator
x.denominator

complex numbers
 

z = 1 + 1.414j
z.real
z.imag

none

random integer, uniform float, normal float

import random

random.randint(0,99)
random.random()
random.gauss(0,1)

import java.util.Random;

Random rand=new Random();

rand.nextInt(100);

rand.nextDouble();

rand.nextGaussian();

set random seed, get and restore seed

import random

random.seed(17)
sd = random.getstate()
random.setstate(sd)

import java.util.Random;

Random rand = new Random(17);

// or reseed an existing generator:
rand.setSeed(17);

get and restore seed: none

bit operators
 

<< >> & | ^ ~

<< >> & | ^ ~

binary, octal, and hex literals and conversions

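0b101010   bin()
052        oct()    # Python 3 literal: 0o52
0x2a       hex()

0b101010   Integer.toBinaryString()    // literal requires SE7
052        Integer.toOctalString()
0x2a       Integer.toHexString()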

base conversion

int("60", 7)

Integer.parseInt("60", 7);

strings

python

java

string literal
 

'don\'t say "no"'
"don't say \"no\""
"don't " 'say "no"'
'''don't say "no"'''
"""don't say "no\""""

"don't say \"no\""

newline in literal
 

triple quote literals only

no

character escapes
 

single and double quoted:
\newline \\ \' \" \a \b \f \n \r \t \v \ooo \xhh

Python 3:
\uhhhh \Uhhhhhhhh

double quoted:
\b \f \n \r \t \uhhhh \\ \" \' \o \oo \ooo

variable interpolation
 

count = 3
item = 'ball'
print('{count} {item}s'.format(
  **locals()))

int count = 3;
String item = "ball";
System.out.println(count + " " + item + "s");

custom delimiters

none

none

sprintf
 

'lorem %s %d %f' % ('ipsum', 13, 3.7)

fmt = 'lorem {0} {1} {2}'
fmt.format('ipsum', 13, 3.7)

String.format("lorem %s %d %f", "ipsum", 13, 3.7)

here document
 

none

none

concatenate
 

s = 'Hello, '
s2 = s + 'World!'

juxtaposition can be used to concatenate literals:
s2 = 'Hello, ' "World!"

String s = "Hello, ";
String s2 = s + "World!";


replicate
 

hbar = '-' * 80

none

split, in two, with delimiters, into characters

'do re mi fa'.split()
'do re mi fa'.split(None, 1)
re.split('(\s+)', 'do re mi fa')
list('abcd')

"do re mi fa".split(" ");
"do re mi fa".split(" ", 2);
"abcd".toCharArray();

join
 

' '.join(['do', 're', 'mi', 'fa'])

none
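
Since SE6 and SE7 have no built-in join or replicate for strings, both are usually written by hand with a StringBuilder. A minimal sketch, assuming hypothetical helper names (StringHelpers, join, repeat) that are not part of any library:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; not part of the JDK.
class StringHelpers {
    // Concatenate the parts, placing sep between consecutive elements.
    static String join(String sep, List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append(sep);
            }
            sb.append(parts.get(i));
        }
        return sb.toString();
    }

    // Concatenate n copies of s.
    static String repeat(String s, int n) {
        StringBuilder sb = new StringBuilder(s.length() * n);
        for (int i = 0; i < n; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(" ", Arrays.asList("do", "re", "mi", "fa")));
        System.out.println(repeat("-", 80));
    }
}
```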

case manipulation

'lorem'.upper()
'LOREM'.lower()
'lorem'.capitalize()

"lorem".toUpperCase();
"LOREM".toLowerCase();

strip
 

' lorem '.strip()
' lorem'.lstrip()
'lorem '.rstrip()

" lorem ".trim()

pad on right, on left
 

'lorem'.ljust(10)
'lorem'.rjust(10)

String.format("%-10s", "lorem")
String.format("%10s", "lorem")

length
 

len('lorem')

"lorem".length()

index of substring
 

'do re re'.index('re')
'do re re'.rindex('re')
raise ValueError if not found

"do re re".indexOf("re")
"do re re".lastIndexOf("re")
return -1 if not found

extract substring
 

'lorem ipsum'[6:11]

"lorem ipsum".substring(6, 11)

extract character

'lorem ipsum'[6]

"lorem ipsum".charAt(6)

chr and ord
 

chr(65)
ord('A')

char c = (char) 65;
int i = (int) 'A';

character translation
 

from string import lowercase as ins
from string import maketrans

outs = ins[13:] + ins[:13]
'hello'.translate(maketrans(ins,outs))

regexes

python

java

literal, custom delimited literal

re.compile('lorem|ipsum')
none

Pattern.compile("lorem|ipsum")
none

character class abbreviations and anchors

char class abbrevs:
. \d \D \s \S \w \W

anchors: ^ $ \A \b \B \Z

match test
 

if re.search('1999', s):
  print('party!')

case insensitive match test

re.search('lorem', 'Lorem', re.I)

modifiers
 

re.I re.M re.S re.X

substitution
 

s = 'do re mi mi mi'
s = re.compile('mi').sub('ma', s)

match, prematch, postmatch
 

m = re.search('\d{4}', s)
if m:
  match = m.group()
  prematch = s[0:m.start(0)]
  postmatch = s[m.end(0):len(s)]

boolean isMatch = "hello".matches(".*ll.*");

group capture
 

rx = '(\d{4})-(\d{2})-(\d{2})'
m = re.search(rx, '2010-06-03')
yr, mo, dy = m.groups()

Matcher m = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})").matcher("2010-06-03");
if (m.find()) {
  String yr = m.group(1), mo = m.group(2), dy = m.group(3);
}

scan
 

s = 'dolor sit amet'
a = re.findall('\w+', s)

none; call Matcher.find() in a loop and collect m.group()
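
Java has no single call equivalent to re.findall. A sketch of the usual loop over Matcher.find(), wrapped in a hypothetical helper (RegexScan.findAll is an illustrative name, not a JDK method):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the JDK.
class RegexScan {
    // Collect every non-overlapping match of regex in input, left to right.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<String>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\w+", "dolor sit amet"));
    }
}
```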

backreference in match and substitution

none

rx = re.compile('(\w+) (\w+)')
rx.sub(r'\2 \1', 'do re')

"do do".matches("(\\w+) \\1")

"do re".replaceAll("(\\w+) (\\w+)", "$2 $1")

recursive regex

none

none

date/time

python

java

date/time type
 

datetime.datetime

java.util.Date

current date/time

import datetime

t = datetime.datetime.now()
utc = datetime.datetime.utcnow()

long millis = System.currentTimeMillis();
Date dt = new Date(millis);

to unix epoch, from unix epoch

from datetime import datetime as dt

epoch = int(t.strftime("%s"))
t2 = dt.fromtimestamp(1304442000)

long epoch = dt.getTime() / 1000;

Date dt2 = new Date(epoch * 1000);

current unix epoch

import datetime

t = datetime.datetime.now()
epoch = int(t.strftime("%s"))

long epoch = System.currentTimeMillis() / 1000;

strftime

t.strftime('%Y-%m-%d %H:%M:%S')

DateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
String s2 = fmt.format(dt);

get HOUR_OF_DAY, DAY_OF_WEEK, etc

import time

line = '106,2007-02-20 00:05:41,121.480600,31.319600, 92, 90,1'
tm = time.strptime(line.split(',')[1], '%Y-%m-%d %H:%M:%S')

day_of_week = int(time.strftime('%w', tm))
hour_of_day = tm.tm_hour
minute = tm.tm_min
sec = tm.tm_sec

DateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
Date dt = fmt.parse("2011-08-23 19:35:59");

Calendar cl = Calendar.getInstance();

cl.setTime(dt);

System.out.println(cl.get(Calendar.HOUR_OF_DAY));

System.out.println(cl.get(Calendar.DAY_OF_WEEK));

System.out.println(cl.get(Calendar.MINUTE));

System.out.println(cl.get(Calendar.SECOND));

System.out.println(cl.get(Calendar.MILLISECOND));

default format example

2011-08-23 19:35:59.411135

Tue Aug 23 17:44:53 PDT 2011

strptime

from datetime import datetime

s = '2011-05-03 10:00:00'
fmt = '%Y-%m-%d %H:%M:%S'
t = datetime.strptime(s, fmt)

DateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
String s = "2011-05-03 10:00:00";
Date dt2 = fmt.parse(s);

parse date w/o format

# pip install python-dateutil
import dateutil.parser

s = 'July 7, 1999'
t = dateutil.parser.parse(s)

result of date subtraction

datetime.timedelta object

long; subtract the getTime() values to get the difference in milliseconds

add time duration

import datetime

delta = datetime.timedelta(
  minutes=10,
  seconds=3)
t = datetime.datetime.now() + delta

Calendar cl = Calendar.getInstance();
cl.add(Calendar.MINUTE, 10);
cl.add(Calendar.SECOND, 3);
Date t = cl.getTime();

local timezone

a datetime object has no timezone information unless a tzinfo object is provided when it is created

a Date holds an instant with no time zone; Calendar and SimpleDateFormat use the default (local) time zone unless one is set

timezone name; offset from UTC; is daylight savings?

import time

tm = time.localtime()
  
time.tzname[tm.tm_isdst]
(time.timezone / -3600) + tm.tm_isdst
tm.tm_isdst

TimeZone tz = TimeZone.getDefault();

tz.getID()
tz.getOffset(System.currentTimeMillis()) / 3600000
tz.inDaylightTime(new Date())

microseconds

t.microsecond

none; java.util.Date has millisecond resolution:
cl.get(Calendar.MILLISECOND)

sleep

import time

time.sleep(0.5)

Thread.sleep(500);   // throws InterruptedException

timeout

import signal, time

class Timeout(Exception): pass

def timeout_handler(signo, fm):
  raise Timeout()

signal.signal(signal.SIGALRM,
  timeout_handler)

try:
  signal.alarm(5)
  time.sleep(10)
except Timeout:
  pass
signal.alarm(0)

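import java.util.concurrent.*;

ExecutorService ex = Executors.newSingleThreadExecutor();
Future<?> f = ex.submit(new Runnable() {
  public void run() {
    try { Thread.sleep(10000); } catch (InterruptedException e) { }
  }
});
try {
  f.get(5, TimeUnit.SECONDS);
} catch (TimeoutException e) {
  f.cancel(true);
}
ex.shutdown();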

arrays and lists

python

java

literal
 

a = [1, 2, 3, 4]

int[] a = {1, 2, 3, 4};

allocate array on heap

arrays must be allocated on heap

int[] a = new int[10];

arraylist must be allocated on heap

List<T> al = new ArrayList<T>(10);

ArrayList<T> al=new ArrayList<T>(anotherAL);

quote words
 

none

none

size
 

len(a)

a.length;

al.size();

empty test
 

not a

if(a.length==0)

if(al.size()==0) or if(al.isEmpty())

lookup
 

a[0]

a[0]

al.get(0)

update
 

a[0] = 'lorem'

a[0] = "lorem";

al.set(0, "new string");

out-of-bounds behavior

a = []
raises IndexError:
a[10]
raises IndexError:
a[10] = 'lorem'

int[] a = {1, 2, 3};
throws ArrayIndexOutOfBoundsException:
a[10]
a[10] = 4;
throws IndexOutOfBoundsException:
al.get(10);

index of array element

a = ['x', 'y', 'z', 'w']
i = a.index('y')

none

int idx = al.indexOf("obj");

slice by endpoints, by length
 

select 3rd and 4th elements:
a[2:4]

Arrays.copyOfRange(a, 2, 4)

al.subList(2, 4)

slice to end
 

a[1:]

Arrays.copyOfRange(a, 1, a.length)

al.subList(1, al.size())
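
Java has no slice syntax, but the standard library can copy a range. A minimal sketch, assuming a hypothetical wrapper class SliceDemo; Arrays.copyOfRange and List.subList are the real library calls, and both take a half-open range like Python's a[from:to]:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper class; names are illustrative only.
class SliceDemo {
    // Copy elements from index 'from' (inclusive) to 'to' (exclusive).
    static int[] slice(int[] a, int from, int to) {
        return Arrays.copyOfRange(a, from, to);
    }

    // subList returns a view backed by the original list,
    // so wrap it in a new ArrayList to get an independent copy.
    static <T> List<T> slice(List<T> al, int from, int to) {
        return new ArrayList<T>(al.subList(from, to));
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        System.out.println(Arrays.toString(slice(a, 2, 4)));
        List<Integer> al = Arrays.asList(1, 2, 3, 4);
        System.out.println(slice(al, 1, al.size()));
    }
}
```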

manipulate back
 

a = [6,7,8]
a.append(9)
a.pop()

none

al.add(newObj);

al.remove(al.size() - 1);

manipulate front
 

a = [6,7,8]
a.insert(0,5)
a.pop(0)

none

al.add(0, obj);

al.remove(0);

concatenate

a = [1,2,3]
a2 = a + [4,5,6]
a.extend([4,5,6])

none

al.addAll(anotherAL);

al.addAll(idx, anotherAL);

replicate

a = [None] * 10
a = [None for i in range(0, 10)]

Object[] a = new Object[10];    // elements start as null

List<String> al = new ArrayList<String>(Collections.<String>nCopies(10, null));

address copy, shallow copy, deep copy

import copy

a = [1,2,[3,4]]
a2 = a
a3 = list(a)
a4 = copy.deepcopy(a)

address copy

ArrayList<T> al2 = al;

shallow copy

ArrayList<T> newAL = new ArrayList<T>(al);

deep copy

none built in; copy each element individually (Collections.copy also copies only references)
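
The difference between a shallow and a deep copy only shows up when the elements are themselves mutable. A sketch using a list of int arrays; the class name CopyDemo and its methods are hypothetical, and the deep copy is written by hand because the collections library copies references only:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
class CopyDemo {
    // New list, same element objects.
    static List<int[]> shallowCopy(List<int[]> src) {
        return new ArrayList<int[]>(src);
    }

    // New list, and each element array is cloned as well.
    static List<int[]> deepCopy(List<int[]> src) {
        List<int[]> out = new ArrayList<int[]>(src.size());
        for (int[] row : src) {
            out.add(row.clone());
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> a = new ArrayList<int[]>();
        a.add(new int[] {3, 4});
        List<int[]> shallow = shallowCopy(a);
        List<int[]> deep = deepCopy(a);
        a.get(0)[0] = 99;                       // mutate the shared element
        System.out.println(shallow.get(0)[0]);  // sees the change
        System.out.println(deep.get(0)[0]);     // does not
    }
}
```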

arrays as function arguments

parameter contains address copy

parameter contains address copy (the reference is passed by value)

iteration
 

for i in [1,2,3]:
  print(i)

none

for(T  ele: al)

    System.out.println(ele.toString());

indexed iteration

a = ['do', 're', 'mi', 'fa']
for i, s in enumerate(a):
  print('%s at index %d' % (s, i))

for(int i=0;i<a.length;i++)

     System.out.println(a[i]);

 

for(int i=0;i<al.size();i++)

    System.out.println(al.get(i));

iterate over range

range replaces xrange in Python 3:
for i in xrange(1, 1000001):
  code

for (int i = 1; i <= 1000000; i++) {

    code

}

instantiate range as array

a = range(1, 11)
Python 3:
a = list(range(1, 11))

none

reverse

a = [1,2,3]
a[::-1]
a.reverse()

none

Collections.reverse(al);

sort

a = ['b', 'A', 'a', 'B']
sorted(a)
a.sort()
a.sort(key=str.lower)

Arrays.sort(a);

Collections.sort(al);
Collections.sort(al, String.CASE_INSENSITIVE_ORDER);

dedupe

a = [1,2,2,3]
a2 = list(set(a))
a = list(set(a))

List<T> deduped = new ArrayList<T>(new HashSet<T>(al));

sets

python

java

membership
 

7 in a

al.contains(obj);

intersection
 

{1,2} & {2,3,4}

Set intersec=new HashSet(set_1);

intersec.retainAll(set_2);

union
 

{1,2} | {2,3,4}

Set union=new HashSet(set_1);

union.addAll(set_2);

relative complement, symmetric difference

{1,2,3} - {2}
{1,2} ^ {2,3,4}

Set diff=new HashSet(set_1);

diff.removeAll(set_2);

pick random element

from random import shuffle, sample

a = {1, 2, 3, 4}
sample(a,1)

//s is a set

int rand_idx=rand.nextInt(s.size());

int i=0;

for(Object o: s){

   if(i==rand_idx){

         return o;

   }

   i+=1;

}

map
 

map(lambda x: x * x, [1,2,3])
# or use list comprehension:
[x*x for x in [1,2,3]]

filter
 

filter(lambda x: x > 1, [1,2,3])
# or use list comprehension:
[x for x in [1,2,3] if x > 1]

reduce
 

# import needed in Python 3 only
from functools import reduce

reduce(lambda x, y: x+y, [1,2,3], 0)
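
SE6 and SE7 have no lambdas, so map, filter, and reduce are normally written as explicit loops. A minimal sketch of the three Python examples above; ListOps and its method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
class ListOps {
    // map: square every element
    static List<Integer> squares(List<Integer> xs) {
        List<Integer> out = new ArrayList<Integer>(xs.size());
        for (int x : xs) {
            out.add(x * x);
        }
        return out;
    }

    // filter: keep elements greater than bound
    static List<Integer> greaterThan(List<Integer> xs, int bound) {
        List<Integer> out = new ArrayList<Integer>();
        for (int x : xs) {
            if (x > bound) {
                out.add(x);
            }
        }
        return out;
    }

    // reduce: sum with initial value 0
    static int sum(List<Integer> xs) {
        int total = 0;
        for (int x : xs) {
            total += x;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(1, 2, 3);
        System.out.println(squares(xs));
        System.out.println(greaterThan(xs, 1));
        System.out.println(sum(xs));
    }
}
```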

universal and existential tests
 

all(i%2 == 0 for i in [1,2,3,4])
any(i%2 == 0 for i in [1,2,3,4])

shuffle and sample

from random import shuffle, sample

a = [1, 2, 3, 4]
shuffle(a)
sample(a, 2)

zip
 

# array of 3 pairs:
a = zip([1,2,3], ['a', 'b', 'c'])

dictionaries

python

java

literal
 

d = { 't':1, 'f':0 }

none, has to allocate first

java.util.TreeMap<String, Integer> m = new java.util.TreeMap<String, Integer>();

size
 

len(d)

m.size()

lookup
 

d['t']

m.get("hello")

out-of-bounds behavior
 

d = {}
raises KeyError:
d['lorem']
adds key/value pair:
d['lorem'] = 'ipsum'


returns null:
m.get("lorem")
adds key/value pair:
m.put("lorem", 1);

is key present
 

'y' in d

m.containsKey(key);

delete entry

d = {1: True, 0: False}
del d[1]

m.remove(key);

from array of pairs, from even length array

a = [[1,'a'], [2,'b'], [3,'c']]
d = dict(a)

a = [1,'a',2,'b',3,'c']
d = dict(zip(a[::2], a[1::2]))

none; call m.put(key, value) for each pair in a loop

merge

d1 = {'a':1, 'b':2}
d2 = {'b':3, 'c':4}
d1.update(d2)

m1.putAll(m2);

invert

to_num = {'t':1, 'f':0}
# dict comprehensions added in 2.7:
to_let = {v:k for k, v
  in to_num.items()}

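Map<Integer, String> toLet = new HashMap<Integer, String>();
for (Map.Entry<String, Integer> e : toNum.entrySet()) {
  toLet.put(e.getValue(), e.getKey());
}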

iteration
 

for k, v in d.iteritems():
  code

Python 3:
for k, v in d.items():
  code

for ( java.util.Map.Entry<String, Integer> e : m.entrySet() ) {
  use e.getKey() or e.getValue()
}

keys and values as arrays

d.keys()
d.values()

Python 3:
list(d.keys())
list(d.values())

m.keySet();
m.values();

default value, computed value

from collections import defaultdict

counts = defaultdict(lambda: 0)
counts['foo'] += 1

class Factorial(dict):
  def __missing__(self, k):
    if k > 1:
      return k * self[k-1]
    else:
      return 1

factorial = Factorial()

none; test m.get(key) for null (or call m.containsKey(key)) and supply the default by hand
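
A Java map of this era has no default value, so the counting idiom checks for null before incrementing. A minimal sketch, assuming a hypothetical WordCount class:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; names are illustrative only.
class WordCount {
    // Count occurrences, treating a missing key as 0.
    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String w : words) {
            Integer old = counts.get(w);   // null when the key is absent
            counts.put(w, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(new String[] {"foo", "bar", "foo"}).get("foo"));
    }
}
```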

functions

python

java

function declaration
 

def add(a, b):
  return a+b

public static int add(int a, int b){

   return a+b;

}

function invocation

add(1, 2)

add(1, 2);

missing argument behavior
 

raises TypeError

compile-time error

default value
 

import math

def my_log(x, base=10):
  return math.log(x)/math.log(base)

my_log(42)
my_log(42, math.e)

implemented by method overloading

static double myLog(double x) {

    return myLog(x, 10);

}

static double myLog(double x, double base) {

    return Math.log(x) / Math.log(base);

}

variable number of arguments

def foo(*a):
  if len(a) >= 1:
    print('first: ' + str(a[0]))
  if len(a) >= 2:
    print('last: ' + str(a[-1]))

public static String concat(String first, String... rest) {
  StringBuilder sb = new StringBuilder(first);
  for (String arg: rest) {
    sb.append(arg);
  }
  return sb.toString();
}
String s = Concat.concat("Hello", ", ", "World", "!");

named parameters
 

def fequal(x, y, **opts):
  eps = opts.get('eps') or 0.01
  return abs(x - y) < eps

fequal(1.0, 1.001)
fequal(1.0, 1.001, eps=0.1**10)

none

pass number or string by reference
 

not possible

not possible

primitive types are always passed by value

pass array or dictionary by reference
 

def foo(x, y):
  x[2] = 5
  y['f'] = -1

a = [1,2,3]
d = {'t':1, 'f':0}
foo(a, d)

object and array references are passed by value: the callee can mutate the object but cannot rebind the caller's variable
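
What Java passes for an object or array is a copy of the reference. A sketch of the consequence, with hypothetical names (PassDemo, mutate, reassign): changes made through the reference are visible to the caller, while assigning a new object to the parameter is not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; names are illustrative only.
class PassDemo {
    // Mutates the caller's array and map through the copied references.
    static void mutate(int[] x, Map<String, Integer> y) {
        x[2] = 5;
        y.put("f", -1);
    }

    // Rebinds only the local parameter; the caller's array is untouched.
    static void reassign(int[] x) {
        x = new int[] {9, 9, 9};
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        Map<String, Integer> d = new HashMap<String, Integer>();
        d.put("t", 1);
        d.put("f", 0);
        mutate(a, d);
        reassign(a);
        System.out.println(a[0] + " " + a[2] + " " + d.get("f"));
    }
}
```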

return value
 

return arg or None

return arg; void methods return nothing

multiple return values
 

def first_and_second(a):
  return a[0], a[1]

x, y = first_and_second([1,2,3])

none
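
A Java method returns a single value, so several results are usually bundled into an array or a small holder object. A minimal sketch of the holder approach; MultiReturn and IntPair are hypothetical names, not library classes:

```java
// Hypothetical demo class; names are illustrative only.
class MultiReturn {
    // Small immutable holder for two ints.
    static final class IntPair {
        final int first;
        final int second;

        IntPair(int first, int second) {
            this.first = first;
            this.second = second;
        }
    }

    // Return the first two elements of a together.
    static IntPair firstAndSecond(int[] a) {
        return new IntPair(a[0], a[1]);
    }

    public static void main(String[] args) {
        IntPair p = firstAndSecond(new int[] {1, 2, 3});
        System.out.println(p.first + " " + p.second);
    }
}
```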

lambda declaration
 

body must be an expression:
sqr = lambda x: x * x

none

lambda invocation

sqr(2)

none

function reference

func = add

none

function with private state

# state not private:
def counter():
  counter.i += 1
  return counter.i

counter.i = 0
print(counter())

none

closure

# Python 3:
def make_counter():
  i = 0
  def counter():
    nonlocal i
    i += 1
    return i
  return counter

nays = make_counter()

none
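
Without lambdas, the closest SE6/SE7 equivalent of a closure with private state is an anonymous class that keeps the state in a field. A sketch under that assumption; the interface IntCounter and the class Counters are hypothetical names:

```java
// Hypothetical single-method interface standing in for a function type.
interface IntCounter {
    int next();
}

// Hypothetical factory class; names are illustrative only.
class Counters {
    // Each call returns a counter with its own private state.
    static IntCounter makeCounter() {
        return new IntCounter() {
            private int i = 0;

            public int next() {
                i += 1;
                return i;
            }
        };
    }

    public static void main(String[] args) {
        IntCounter nays = makeCounter();
        System.out.println(nays.next());
        System.out.println(nays.next());
    }
}
```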

generator

def make_counter():
  i = 0
  while True:
    i += 1
    yield i

nays = make_counter()
print(next(nays))

none

decorator

def logcall(f):
  def wrapper(*a, **opts):
    print('calling ' + f.__name__)
    result = f(*a, **opts)
    print('called ' + f.__name__)
    return result
  return wrapper

@logcall
def square(x):
  return x * x

square(5)

execution control

python

java

if
 

if 0 == n:
  print('no hits')
elif 1 == n:
  print('one hit')
else:
  print(str(n) + ' hits')

if (i>0) {
  signum = 1;
} else if (i==0) {
  signum = 0;
} else {
  signum = -1;
}

switch

none

switch (i) {
case 0:
  signum = 0;
  break;
case 1:
  signum = 1;
  break;
default:
  signum = -1;
  break;
}

while
 

while i < 100:
  i += 1

int i = 0;
while (i<10) {

  i++;
}

c-style for
 

none

int n = 1;
for (int i=1; i<=10; i++) {
  n *= i;
}

break, continue, redo
 

break continue none

break continue

control structure keywords

elif else for if while

switch case if else for do while

what do does

raises NameError unless a value was assigned to it

begins a do { ... } while (condition); loop, whose body runs at least once

statement modifiers
 

none

none

raise exception
 

raise Exception('bad arg')

throw new Exception("failed");

catch exception
 

try:
  risky()
except:
  print('risky failed')

try {
  throw new Exception("failed");
} catch (Exception e) {
  System.out.println(e.getMessage());
}

global variable for last exception

last exception: sys.exc_info()[1]

define exception

class Bam(Exception):
  def __init__(self):
    super(Bam, self).__init__('bam!')

catch exception by type

try:
  raise Bam()
except Bam as e:
  print(e)

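// assuming class Bam extends Exception
try {
  throw new Bam();
} catch (Bam e) {
  System.out.println(e.getMessage());
}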

finally/ensure
 

acquire_resource()
try:
  risky()
finally:
  release_resource()

try {
  risky code
} finally {
  perform cleanup
}

start thread
 

import threading, time

class sleep10(threading.Thread):
  def run(self):
    time.sleep(10)

thr = sleep10()
thr.start()

// alternative definition: class PrimeRun extends Thread

class PrimeRun implements Runnable {

         long minPrime;

         PrimeRun(long minPrime) {

             this.minPrime = minPrime;

         }

         public void run() {

             // compute primes larger than minPrime

              . . .

         }

     }

PrimeRun p = new PrimeRun(143);

     new Thread(p).start();

wait on thread
 

thr.join()

Thread t = new Thread(p);
t.start();

t.join();

t.join(100);   // wait at most 100 ms

Files

python

java

print to standard output
 

print('Hello, World!')

System.out.println("Hello, World!");

read from standard input

line = sys.stdin.readline()

import java.util.Scanner;

Scanner sc=new Scanner(System.in);

String line = sc.nextLine();

standard file handles
 

sys.stdin sys.stdout sys.stderr

System.in System.out System.err

open file
 

f = open('/etc/hosts')

import java.io.File;
import java.util.Scanner;

Scanner sc = new Scanner(new File("/etc/hosts"));

open file for writing and if not exists then creates it
 

f = open('/tmp/test', 'w')

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
File fo = new File("/tmp/test");

BufferedWriter fout = new BufferedWriter(new FileWriter(fo));

open file for append

with open('/tmp/test', 'a') as f:
  f.write('lorem ipsum\n')

import java.io.BufferedWriter;
import java.io.FileWriter;
BufferedWriter fout = new BufferedWriter(new FileWriter("/tmp/test", true));

close file
 

f.close()

fout.close();

read line
 

f.readline()

// alternatively, use Scanner class

import java.io.BufferedReader;
import java.io.FileReader;
BufferedReader in = new BufferedReader(new FileReader("/etc/passwd"));
String line;
while ((line = in.readLine()) != null) {
  process line
}

iterate over file by line
 

for line in f:

while ((line = in.readLine()) != null) {
  process line
}

chomp
 

line = line.rstrip('\r\n')

// readLine() already strips the line terminator
String line = in.readLine();

read entire file into array or string

a = f.readlines()
s = f.read()

StringBuilder sb = new StringBuilder();
String line;
while ((line = in.readLine()) != null) {
  sb.append(line).append('\n');
}
String s = sb.toString();

write to file
 

f.write('lorem ipsum')

import java.io.BufferedWriter;
import java.io.FileWriter;
BufferedWriter fout = new BufferedWriter(new FileWriter("/tmp/test2"));
int i;
for (i=0; i<10; i++) {
  fout.write(String.format("%d", i));
  fout.newLine();
}
fout.close();

flush file handle
 

f.flush()

fout.flush();

file test, regular file test
 

os.path.exists('/etc/hosts')
os.path.isfile('/etc/hosts')

File f=new File("/etc/hosts");

f.exists()

f.isFile();

f.isAbsolute();

copy file, remove file, rename file

import shutil

shutil.copy('/tmp/foo', '/tmp/bar')
os.remove('/tmp/foo')
shutil.move('/tmp/bar', '/tmp/foo')

File f = new File("/tmp/foo");

f.delete();

f.renameTo(new File("/tmp/bar"));

 

// copy a file

public static void copyFile(File sourceFile, File destFile) throws IOException {
    if(!destFile.exists()) {
        destFile.createNewFile();
    }

    FileChannel source = null;
    FileChannel destination = null;

    try {
        source = new FileInputStream(sourceFile).getChannel();
        destination = new FileOutputStream(destFile).getChannel();
        destination.transferFrom(source, 0, source.size());
    }
    finally {
        if(source != null) {
            source.close();
        }
        if(destination != null) {
            destination.close();
        }
    }
}

set file permissions

os.chmod('/tmp/foo', 0o755)

f.setExecutable(true);

f.setReadable(true);

f.setWritable(true);

temporary file

import tempfile

f = tempfile.NamedTemporaryFile(
  prefix='foo')
f.write('lorem ipsum\n')
f.close()

print("tmp file: %s" % f.name)

File f = File.createTempFile("foo", ".tmp");

in memory file

from StringIO import StringIO

f = StringIO()
f.write('lorem ipsum\n')
s = f.getvalue()

Python 3 moved StringIO to the io module

directories

python

java

build pathname

os.path.join('/etc', 'hosts')

dirname and basename

os.path.dirname('/etc/hosts')
os.path.basename('/etc/hosts')

File f = new File("/etc/hosts");

f.getParent();
f.getName();

absolute pathname

os.path.abspath('..')

File f=new File("/etc/hosts");

String abp=f.getAbsolutePath();

iterate over directory by file

for filename in os.listdir('/etc'):
  print(filename)

# alternative

for filename in glob.glob('/etc'+ '/*'):

   print(filename)

make directory

dirname = '/tmp/foo/bar'
if not os.path.isdir(dirname):
  os.makedirs(dirname)

File d = new File("/tmp/foo/bar");

d.mkdirs();

recursive copy

import shutil

shutil.copytree('/tmp/foodir',
  '/tmp/bardir')

remove empty directory

os.rmdir('/tmp/foodir')

f.delete();

remove directory and contents

import shutil

shutil.rmtree('/tmp/foodir')

directory test
 

os.path.isdir('/tmp')

f.isDirectory();

processes and environment

python

java

command line args, script name
 

len(sys.argv)-1
sys.argv[1] sys.argv[2] etc
sys.argv[0]

// in main(String[] args):
args.length
args[0] args[1] etc
none

getopt

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--file', '-f',
  dest='file')

args = parser.parse_args()
src = args.file

none in the standard library; parse the args array by hand or use a third-party library

get and set environment variable
 

os.getenv('HOME')

os.environ['PATH'] = '/bin'

System.getenv("HOME")

none; the environment of the running JVM cannot be modified

exit
 

sys.exit(0)

System.exit(0);

set signal handler
 

import signal, sys

def handler(signo, frame):
  print('exiting...')
  sys.exit(-1)
signal.signal(signal.SIGINT, handler)

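// no portable signal API; a shutdown hook runs on SIGINT:
Runtime.getRuntime().addShutdownHook(new Thread() {
  public void run() {
    System.out.println("exiting...");
  }
});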

executable test

os.access('/bin/ls', os.X_OK)

new File("/bin/ls").canExecute()

external command
 

if os.system('ls -l /tmp'):
  raise Exception('ls failed')

Process p = Runtime.getRuntime().exec("ls -l /tmp");
if (p.waitFor() != 0) {
  throw new Exception("ls failed");
}

escaped external command
 

import subprocess

cmd = ['ls', '-l', '/tmp']
if subprocess.call(cmd):
  raise Exception('ls failed')

// path is a String read from the user
Process p = new ProcessBuilder("ls", "-l", path).start();
if (p.waitFor() != 0) {
  throw new Exception("ls failed");
}

backticks
 

import subprocess

cmd = ['ls', '-l', '/tmp']
files = subprocess.check_output(cmd)

Process p = new ProcessBuilder("ls", "-l", "/tmp").start();
BufferedReader out = new BufferedReader(
  new InputStreamReader(p.getInputStream()));
// read lines from out, then check that p.waitFor() == 0

libraries and modules

python

java

load library
 

import foo

import foo.Bar;

reload library
 

reload(foo)

none

library path
 

sys.path.append('/some/path')

none at run time; the classpath is fixed when the JVM starts

library path environment variable

PYTHONPATH

CLASSPATH

library path command line option

none

-cp or -classpath

main in library

if __name__ == '__main__':
  code

public static void main(String[] args) {
  code
}

module declaration
 

put declarations in foo.py

package foo; as the first statement of each source file

submodule declaration

create directory foo in library path containing file bar.py

package foo.bar; in source files under directory foo/bar

module separator
 

foo.bar.baz()

foo.bar.Baz.method()

import all definitions in module
 

from foo import *

import foo.*;

import definitions
 

from foo import bar, baz

import foo.Bar;
import foo.Baz;

managing multiple installations

$ virtualenv -p /usr/bin/python foo
$ source foo/bin/activate
$ echo $VIRTUAL_ENV
$ deactivate

none built in; select a JDK by setting JAVA_HOME and PATH

list installed packages, install a package
 

$ pip freeze
$ pip install jinja2

none built in; build tools such as Maven manage third-party libraries

package specification format

in setup.py:

#!/usr/bin/env python

from distutils.core import setup

setup(
  name='foo',
  author='Joe Foo',
  version='1.0',
  description='a package',
  py_modules=['foo'])

none built in; a JAR file carries basic metadata in META-INF/MANIFEST.MF

objects

python

java

define class
 

class Int:
  def __init__(self, v=0):
    self.value = v

class Int {
  public int value;
  Int() { this(0); }
  Int(int v) { value = v; }
}

create object
 

i = Int()
i2 = Int(7)

Int i = new Int();
Int i2 = new Int(7);

get and set attribute
 

v = i.value
i.value = v+1

int v = i.value;
i.value = v + 1;

instance variable accessibility

public; attributes starting with underscore private by convention

package-private by default; declare fields public, protected, or private to change access

define method
 

def plus(self,v):
  return self.value + v

int plus(int v) {
  return value + v;
}

invoke method
 

i.plus(7)

i.plus(7)

destructor
 

def __del__(self):
  print('bye, %d' % self.value)

// finalizers are not guaranteed to run
protected void finalize() {
  System.out.println("bye, " + value);
}

method missing
 

def __getattr__(self, name):
  s = 'no def: '+name+' arity: %d'
  return lambda *a: print(s % len(a))

none; calling an undefined method is a compile-time error

inheritance
 

class Counter(Int):
  instances = 0
  def __init__(self, v=0):
    Counter.instances += 1
    Int.__init__(self, v)
  def incr(self):
    self.value += 1

class Counter extends Int {
  static int instances = 0;
  Counter() { this(0); }
  Counter(int v) {
    super(v);
    instances += 1;
  }
  void incr() {
    value += 1;
  }
}

invoke class method
 

Counter.instances

Counter.instances

operator overloading

define special methods such as __add__ on the class

none

reflection

python

java

object id

id(o)

System.identityHashCode(o)

inspect type
 

type([]) == list

o.getClass() == ArrayList.class

basic types

NoneType
bool
int
long
float
str
SRE_Pattern
datetime
list
array
dict
object
file

boolean
byte short int long
float double
char
String
java.util.regex.Pattern
java.util.Date
java.util.ArrayList
array types such as int[]
java.util.HashMap
Object
java.io.File

inspect class

o.__class__ == Foo
isinstance(o, Foo)

o.getClass() == Foo.class
o instanceof Foo

inspect class hierarchy

o.__class__.__bases__

o.getClass().getSuperclass()
o.getClass().getInterfaces()

has method?
 

hasattr(o, 'reverse')

// throws NoSuchMethodException if absent:
o.getClass().getMethod("reverse")

message passing
 

for i in range(1,10):
  getattr(o, 'phone'+str(i))(None)

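// assuming o has methods setPhone1(String) ... setPhone9(String)
for (int i = 1; i <= 9; i++) {
  o.getClass().getMethod("setPhone" + i, String.class).invoke(o, (Object) null);
}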

eval
 

argument of eval must be an expression:
while True:
  print(eval(sys.stdin.readline()))

none

inspect methods
 

[m for m in dir(o)
  if callable(getattr(o,m))]

o.getClass().getMethods()

inspect attributes
 

dir(o)

o.getClass().getDeclaredFields()

pretty print
 

import pprint

d = {'lorem':1, 'ipsum':[2,3]}
pprint.PrettyPrinter().pprint(d)

// collections have a readable toString():
System.out.println(m);
// arrays need a helper:
System.out.println(Arrays.toString(a));

source line number and file name

import inspect

cf = inspect.currentframe()
cf.f_lineno
cf.f_code.co_filename

StackTraceElement e = Thread.currentThread().getStackTrace()[1];
e.getLineNumber()
e.getFileName()

web

python

java

http get
 

import httplib

url = 'www.google.com'
f = httplib.HTTPConnection(url)
f.request("GET",'/')
s = f.getresponse().read()

import java.net.URL;
import java.util.Scanner;

URL url = new URL("http://www.google.com/");
Scanner sc = new Scanner(url.openStream()).useDelimiter("\\A");
String s = sc.hasNext() ? sc.next() : "";

url encode/decode
 

# Python 3 location: urllib.parse
import urllib

urllib.quote_plus("lorem ipsum")
urllib.unquote_plus("lorem+ipsum")

import java.net.URLDecoder;
import java.net.URLEncoder;

URLEncoder.encode("lorem ipsum", "UTF-8")
URLDecoder.decode("lorem+ipsum", "UTF-8")

base64 encode

import base64

s = open('foo.png', 'rb').read()
print(base64.b64encode(s))

import javax.xml.bind.DatatypeConverter;

// bytes holds the file contents as a byte[]
String s = DatatypeConverter.printBase64Binary(bytes);

json

import json

s = json.dumps({'t':1, 'f':0})
d = json.loads(s)

none in the standard library; use a third-party JSON library

build xml

import xml.etree.ElementTree as ET

builder = ET.TreeBuilder()
builder.start('a', {})
builder.start('b', {})
builder.data('foo')
builder.end('b')
builder.end('a')

et = builder.close()
print(ET.tostring(et))

build a DOM tree with javax.xml.parsers.DocumentBuilder and serialize it with javax.xml.transform.Transformer

parse xml

from xml.etree import ElementTree

xml = '<a><b>foo</b></a>'
doc = ElementTree.fromstring(xml)
print(doc[0].text)

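import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

String xml = "<a><b>foo</b></a>";
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    .parse(new InputSource(new StringReader(xml)));
System.out.println(doc.getDocumentElement().getFirstChild().getTextContent());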

xpath

from xml.etree import ElementTree

xml = '<a><b><c>foo</c></b></a>'
doc = ElementTree.fromstring(xml)
node = doc.find("b/c")
print(node.text)

require 'rexml/document'
include REXML

xml = '<a><b><c>foo</c></b></a>'
doc = Document.new(xml)
node = XPath.first(doc,'/a/b/c')
puts node.text

tests

python

java

test class

import unittest

class FooTest(unittest.TestCase):
  def test_01(self):
    assert(True)

if __name__ == '__main__':
  unittest.main()

require 'test/unit'

class FooTest < Test::Unit::TestCase
  def test_01
    assert(true)
  end
end

run tests, run test method

$ python foo_test.py
$ python foo_test.py FooTest.test_01

$ ruby foo_test.rb
$ ruby foo_test.rb -n test_01

equality assertion

s = 'do re me'
self.assertEqual('do re me', s)

s = "do re me"
assert_equal("do re me", s)

regex assertion

s = 'lorem ipsum'
# uses re.search, not re.match;
# named assertRegex in Python 3.2 and later:
self.assertRegexpMatches(s, 'lorem')

s = "lorem ipsum"
assert_match(/lorem/, s)

exception assertion

a = []
with self.assertRaises(IndexError):
  a[0]

assert_raises(ZeroDivisionError) do
  1 / 0
end

setup

# in class FooTest:
def setUp(self):
  print('setting up')

# in class FooTest:
def setup
  puts "setting up"
end

teardown

# in class FooTest:
def tearDown(self):
  print("tearing down")

# in class FooTest:
def teardown
  puts "tearing down"
end

debugging and profiling

python

java

check syntax
 

import py_compile

# precompile to bytecode:
py_compile.compile('foo.py')

$ ruby -c foo.rb

flags for stronger and strongest warnings

$ python -t foo.py
$ python -3t foo.py

$ ruby -w foo.rb
$ ruby -W2 foo.rb

lint

$ sudo pip install pylint
$ pylint foo.py

run debugger

$ python -m pdb foo.py

$ sudo gem install ruby-debug
$ rdebug foo.rb

debugger commands

h l n s b c w u d p q

h l n s b c w u down p q

benchmark code

import timeit

timeit.timeit('i += 1',
  'i = 0',
  number=1000000)

require 'benchmark'

n = 1_000_000
i = 0
# braces bind the block to measure, not to puts:
puts Benchmark.measure {
  n.times { i += 1 }
}

profile code

$ python -m cProfile foo.py

$ sudo gem install ruby-prof
$ ruby-prof foo.rb

interop

python

java

version
 

Jython 2.5

JRuby 1.4

repl
 

$ jython

$ jirb

interpreter
 

$ jython

$ jruby

compiler
 

none in 2.5.1

$ jrubyc

prologue
 

import java

none

new
 

rnd = java.util.Random()

rnd = java.util.Random.new

method
 

rnd.nextFloat()

rnd.next_float

import
 

from java.util import Random

rnd = Random()

java_import java.util.Random
rnd = Random.new

non-bundled java libraries
 

import sys

sys.path.append('path/to/mycode.jar')
import MyClass

require 'path/to/mycode.jar'

shadowing avoidance
 

import java.io as javaio

module JavaIO
  include_package "java.io"
end

convert native array to java array
 

import jarray

jarray.array([1,2,3],'i')

[1,2,3].to_java(Java::int)

are java classes subclassable?
 

yes

yes

are java classes open?
 

no

yes

General Footnotes

versions used

The versions used for testing code in the reference sheet.

implicit prologue

Code which examples in the sheet assume to have already been executed.

python:

To keep the examples short we assume that os, re, and sys are always imported.

show version

How to get the version.

python:

The following function will return the version number as a string:

 
import platform
platform.python_version()

interpreter

The customary name of the interpreter and how to invoke it.

repl

The customary name of the repl.

python:

The python repl saves the value of the last evaluated expression in _.

command line script

How to pass the code to be executed to the interpreter as a command line argument.

statement separator

How the parser determines the end of a statement.

python:

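Newline does not terminate a statement when it appears inside parentheses, square brackets, or curly braces, inside a triple quote literal, or after a backslash.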

Python single quote '' and double quote "" strings cannot contain newlines except as the two character escaped form \n. Putting a newline in these strings results in a syntax error. There is however a multi-line string literal which starts and ends with three single quotes ''' or three double quotes: """.

A newline that would normally terminate a statement can be escaped with a backslash.
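
The continuation rules can be sketched as follows. This is a minimal illustration; the variable names are made up for the example.

```python
# A newline inside (), [], or {} does not end the statement.
total = (1 +
         2 +
         3)

names = [
    'lorem',
    'ipsum',
]

# A backslash escapes the newline that would otherwise end the statement.
product = 2 * \
    3

# A triple quote literal may span lines; the newline becomes part of the string.
text = '''first
second'''

# Semicolons separate statements on one line.
a = 1; b = 2
```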

block delimiters

How blocks are delimited.

python:

Python blocks begin with a line that ends in a colon. The block ends with the first line that is not indented further than the initial line. Python raises an IndentationError if the statements in the block that are not in a nested block are not all indented the same. Using tabs in Python source code is discouraged, and many editors replace them automatically with spaces. Python 2 treats a tab as advancing to the next multiple of 8 columns; Python 3 raises a TabError when tabs and spaces are mixed inconsistently.

The python repl switches from a >>> prompt to a ... prompt inside a block. A blank line terminates the block.

java:

assignment

How to assign a value to a variable.

python:

If the variable on the left has not previously been defined in the current scope, then it is created. This may hide a variable in a containing scope.

Assignment does not return a value and cannot be used in an expression. Thus, assignment cannot be used in a conditional test, removing the possibility of using assignment (=) in place of an equality test (==). Assignments can nevertheless be chained to assign a value to multiple variables:

 
a = b = 3

java:

Assignment operators are right associative and evaluate to the assigned value, so they can be chained. A variable must be declared with its type before a value can be assigned to it.

parallel assignment

How to assign values to variables in parallel.

python:

The r-value can be a list or tuple:

 
nums = [1,2,3]
a,b,c = nums
more_nums = (6,7,8)
d,e,f = more_nums

Nested sequences of expressions can be assigned to nested sequences of l-values, provided the nesting matches. This assignment will set a to 1, b to 2, and c to 3:

 
(a,[b,c]) = [1,(2,3)]

This assignment will raise a TypeError:

 
(a,(b,c)) = ((1,2),3)

In Python 3 the splat operator * can be used to collect the remaining right side elements in a list:

 
x, y, *z = 1, 2        # assigns [] to z
x, y, *z = 1, 2, 3     # assigns [3] to z
x, y, *z = 1, 2, 3, 4  # assigns [3, 4] to z

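java:

Java has no parallel assignment. The elements of an array must be assigned to variables one at a time:

 
int[] nums = {1, 2, 3};
int a = nums[0], b = nums[1], c = nums[2];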

swap

How to swap the values held by two variables.

compound assignment

Compound assignment operators mutate a variable, setting it to the value of an operation which takes the value of the variable as an argument.

First row: arithmetic operator assignment: addition, subtraction, multiplication, (float) division, integer division, modulus, and exponentiation.
Second row: string concatenation assignment and string replication assignment
Third row: logical operator assignment: and, or, xor
Fourth row: bit operator assignment: left shift, right shift, and, or, xor.

python:

Python compound assignment operators do not return a value and hence cannot be used in expressions.

increment and decrement

The C-style increment and decrement operators can be used to increment or decrement values. They return values and thus can be used in expressions. The prefix versions return the value in the variable after mutation, and the postfix versions return the value before mutation.

Incrementing a value two or more times in an expression makes the order of evaluation significant:

 
x = 1;
foo(++x, ++x); // foo(2, 3) or foo(3, 2)?
 
x = 1;
y = ++x/++x;  // y = 2/3 or y = 3/2?

Python avoids the problem by not having an in-expression increment or decrement.

Ruby mostly avoids the problem by providing a non-mutating increment and decrement. However, here is a Ruby expression which is dependent on order of evaluation:

 
x = 1
y = (x += 1)/(x += 1)

ruby:

The Integer class defines succ, pred, and next, which is a synonym for succ.

The String class defines succ, succ!, next, and next!. succ! and next! mutate the string.

local variable declarations

How to declare variables which are local to the scope defining region which immediately contains them.

python:

A variable is created by assignment if one does not already exist. If the variable is inside a function or method, then its scope is the body of the function or method. Otherwise it is a global.

regions which define local scope

A list of regions which define a scope for the local variables they contain.

Local variables defined inside the region are only in scope while code within the region is executing. If the language does not have closures, then code outside the region has no access to local variables defined inside the region. If the language does have closures, then code inside the region can make local variables accessible to code outside the region by returning a reference.

A region which is top level hides local variables in the scope which contains it from the code it contains. A region can also be top level if the syntax requirements of the language prohibit it from being placed inside another scope defining region.

A region is nestable if it can be placed inside another scope defining region, and if code in the inner region can access local variables in the outer region.

python:

Only functions and methods define scope. Function definitions can be nested. When this is done, inner scopes have read access to variables defined in outer scopes. Attempting to write (i.e. assign) to a variable defined in an outer scope will instead result in a variable getting created in the inner scope. Python trivia question: what would happen if the following code were executed?

 
def foo():
    v = 1
    def bar():
        print(v)
        v = 2
        print(v)
    bar()
 
foo()
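
As a hedged answer to the trivia question, assuming standard CPython behavior: because bar assigns to v, the name v is local to the whole body of bar, so the first print raises an UnboundLocalError before any assignment happens. The sketch below reproduces this and also shows the Python 3 nonlocal statement, which the sheet does not cover, as one way to assign to the variable in the enclosing scope. The name foo_nonlocal is made up for the example.

```python
def foo():
    v = 1
    def bar():
        # v is assigned below, so it is local to bar for the whole body;
        # reading it here happens before it has a value.
        print(v)
        v = 2
    bar()

try:
    foo()
    outcome = 'no error'
except UnboundLocalError:
    outcome = 'UnboundLocalError'

def foo_nonlocal():
    v = 1
    def bar():
        nonlocal v  # Python 3 only: rebind v in the enclosing scope
        v = 2
    bar()
    return v

result = foo_nonlocal()
```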
 
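ruby:

In Ruby 1.9 block parameters are local to the block. In Ruby 1.8 a block parameter with the same name as a variable in the enclosing scope overwrites that variable:

 
x = 3
id = lambda { |x| x }
id.call(7)
puts x # 1.8 prints 7; 1.9 prints 3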

global variable

How to declare and access a variable with global scope.

python:

A variable is global if it is defined at the top level of a file (i.e. outside any function definition). Although the variable is global, it must be imported individually or be prefixed with the module name to be accessed from another file. It can be read from inside a function or method without a declaration, but to be assigned to there it must be declared with the global keyword.
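
A minimal sketch of how the global keyword affects reading and assignment inside functions. The names counter, read_counter, shadow_counter, and increment_counter are made up for the example.

```python
counter = 0

def read_counter():
    # Reading a global needs no declaration.
    return counter

def shadow_counter():
    # Without a global declaration, assignment creates a new local variable.
    counter = 100
    return counter

def increment_counter():
    # The global declaration makes assignment rebind the module-level name.
    global counter
    counter += 1

shadowed = shadow_counter()
after_shadow = read_counter()
increment_counter()
after_increment = read_counter()
```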

ruby:

A variable is global if it starts with a dollar sign: $.

constant declaration

How to declare a constant.

to-end-of-line comment

How to create a comment that ends at the next newline.

comment out multiple lines

How to comment out multiple lines.

python:

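The triple single quote ''' and triple double quote """ syntax creates string literals. A string literal that stands alone as a statement is evaluated and discarded, which is why it can be used to comment out multiple lines.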

null

The null literal.

null test

How to test if a variable contains null.

undefined variable access

The result of attempting to access an undefined variable.

undefined test

Arithmetic and Logic Footnotes

true and false

Literals for the booleans.

These are the return values of the comparison operators.

falsehoods

Values which behave like the false boolean in a conditional context.

Examples of conditional contexts are the conditional clause of an if statement and the test of a while loop.

python:

Whether an object evaluates to True or False in a boolean context can be customized by implementing a __nonzero__ (Python 2) or __bool__ (Python 3) instance method for the class.
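
A small sketch of a class that customizes its truth value. Basket is a hypothetical class; aliasing __nonzero__ to __bool__ is one way to support both Python 2 and Python 3.

```python
class Basket(object):
    """Hypothetical container that is falsy when empty."""

    def __init__(self, items=()):
        self.items = list(items)

    def __bool__(self):          # Python 3
        return len(self.items) > 0

    __nonzero__ = __bool__       # Python 2 looks up this name instead

empty = Basket()
full = Basket(['apple'])

empty_label = 'truthy' if empty else 'falsy'
full_label = 'truthy' if full else 'falsy'
```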

logical operators

Logical and, or, and not.

conditional expression

How to write a conditional expression. A ternary operator is an operator which takes three arguments. Since

condition ? true value : false value

is the only ternary operator in C, it is unambiguous to refer to it as the ternary operator.

python:

The Python conditional expression comes from Algol.

comparison operators

Equality, inequality, greater than, less than, greater than or equal, less than or equal.

Also known as the relational operators.

python:

Comparison operators can be chained. The following expressions evaluate to true:

 
1 < 2 < 3
1 == 1 != 2

In general, if A_i are expressions and op_i are comparison operators, then

    A_1 op_1 A_2 op_2 A_3 ... A_n op_n A_{n+1}

is true if and only if each of the following is true:

    A_1 op_1 A_2
    A_2 op_2 A_3
    ...
    A_n op_n A_{n+1}
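
A sketch of how a chained comparison is evaluated, assuming standard Python semantics: each operand is evaluated at most once, and evaluation stops at the first pairwise comparison that is false. The trace helper is made up to make the evaluations visible.

```python
def trace(value, log):
    """Record each evaluation so the number of evaluations is visible."""
    log.append(value)
    return value

log = []
# Behaves like (1 < 2) and (2 < 3), except that the middle operand
# is evaluated only once.
ascending = trace(1, log) < trace(2, log) < trace(3, log)

short_log = []
# The chain short-circuits: once 2 < 1 is False, the last operand
# is never evaluated.
failed = trace(2, short_log) < trace(1, short_log) < trace(3, short_log)
```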

three value comparison

Binary comparison operators which return -1, 0, or 1 depending upon whether the left argument is less than, equal to, or greater than the right argument.

The <=> symbol is called the spaceship operator.

convert from string, to string

How to convert string data to numeric data and vice versa.

python:

float and int raise a ValueError if called on a string that is not entirely numeric; leading and trailing whitespace is ignored.
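
A minimal sketch of string conversion with error handling. The to_int helper and its default parameter are hypothetical, not part of the standard library.

```python
def to_int(s, default=None):
    """Hypothetical helper: parse s as an int, or return default."""
    try:
        return int(s)
    except ValueError:
        return default

parsed = to_int('42')
padded = to_int('  42\n')      # surrounding whitespace is accepted
rejected = to_int('42abc')     # trailing non-numeric text is not
as_float = float('3.5e2')
back_to_str = str(350.0)
```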

arithmetic operators

The operators for addition, subtraction, multiplication, float division, integer division, modulus, and exponentiation.

integer division

How to get the integer quotient of two integers. How to get the integer quotient and remainder.

float division

How to perform floating point division, even if the operands might be integers.

arithmetic functions

Some arithmetic functions. Trigonometric functions are in radians unless otherwise noted. Logarithms are natural unless otherwise noted.

python:

Python also has math.log10. To compute the log of x for base b, use:

 
math.log(x)/math.log(b)
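
The quotient formula can be checked against the optional base argument that math.log also accepts. The values of x and b are chosen only for illustration.

```python
import math

x, b = 1024.0, 2.0

by_quotient = math.log(x) / math.log(b)
# math.log also accepts the base as an optional second argument.
by_argument = math.log(x, b)
common = math.log10(1000.0)
```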

arithmetic truncation

How to truncate a float to the nearest integer towards zero; how to round a float to the nearest integer; how to find the nearest integer above a float; how to find the nearest integer below a float; how to take the absolute value.
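
A short sketch of the five operations in Python, using a negative value so that truncation and floor differ. The results are converted with int because some of these functions return floats in Python 2.

```python
import math

x = -2.7

truncated = int(x)               # toward zero
rounded = int(round(x))          # nearest integer
ceiling = int(math.ceil(x))      # nearest integer above
floor = int(math.floor(x))       # nearest integer below
absolute = abs(x)
```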

min and max

How to get the min and max.

division by zero

What happens when division by zero is performed.

integer overflow

What happens when the largest representable integer is exceeded.

float overflow

What happens when the largest representable float is exceeded.

sqrt -2

The result of taking the square root of negative two.

rational numbers

How to create rational numbers and get the numerator and denominator.

complex numbers

python:

Most of the functions in math have analogues in cmath which will work correctly on complex numbers.

random integer, uniform float, normal float

How to generate a random integer between 0 and 99, inclusive; a float between zero and one in a uniform distribution; or a float in a normal distribution with mean zero and standard deviation one.

set random seed, get and restore seed

How to set the random seed; how to get the current random seed and later restore it.

All the languages in the sheet set the seed automatically to a value that is difficult to predict. The Ruby 1.9 MRI interpreter uses the current time and process ID, for example. As a result there is usually no need to set the seed.

Setting the seed to a hardcoded value yields a repeatable sequence of pseudorandom numbers. This can be used to ensure that unit tests which cover code using random numbers don't fail intermittently.

The seed is global state. If multiple functions are generating random numbers then saving and restoring the seed may be necessary to produce a repeatable sequence.
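
A sketch of both techniques with Python's random module. Python does not expose the seed itself, so saving and restoring is done here with getstate and setstate; the seed value 17 is arbitrary.

```python
import random

# A hardcoded seed gives a repeatable sequence.
random.seed(17)
first_run = [random.randint(0, 99) for _ in range(3)]
random.seed(17)
second_run = [random.randint(0, 99) for _ in range(3)]

# getstate and setstate save and restore the generator state.
saved = random.getstate()
before = random.random()
random.setstate(saved)
after = random.random()
```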

bit operators

The bit operators for left shift, right shift, and, inclusive or, exclusive or, and negation.

binary, octal, and hex literals

Binary, octal, and hex integer literals

base conversion

How to convert integers to strings of digits of a given base. How to convert such strings into integers.

python:

Python has the functions bin, oct, and hex which take an integer and return a string encoding the integer in base 2, 8, and 16.

 
bin(42)
oct(42)
hex(42)
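
For the reverse direction, a minimal sketch using the optional base argument of int. The octal prefix differs between Python 2 and Python 3, as noted in the comment.

```python
# Integer to string of digits.
as_binary = bin(42)
as_hex = hex(42)
as_octal = oct(42)   # '052' in Python 2, '0o52' in Python 3

# String of digits to integer: int takes the base as a second argument.
from_binary = int('101010', 2)
from_hex = int('2a', 16)
from_base_7 = int('60', 7)
```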

String Footnotes

string literal

The syntax for string literals.

newline in literal

Whether newlines are permitted in string literals.

python:

Newlines are not permitted in single quote and double quote string literals. A string can continue onto the following line if the last character on the line is a backslash. In this case, neither the backslash nor the newline is taken to be part of the string.

Triple quote literals, which are string literals terminated by three single quotes or three double quotes, can contain newlines:

 
'''This is
two lines'''
 
"""This is also
two lines"""

character escapes

Backslash escape sequences for inserting special characters into string literals.

unrecognized backslash escape sequence

          double quote          single quote
Python    preserve backslash    preserve backslash

python:

When string literals have an r or R prefix there are no backslash escape sequences and any backslashes thus appear in the created string. The delimiter can be inserted into a string if it is preceded by a backslash, but the backslash is also inserted. It is thus not possible to create a string with an r or R prefix that ends in a backslash. The r and R prefixes can be used with single or double quotes:

 
r'C:\Documents and Settings\Admin'
r"C:\Windows\System32"

The \uhhhh escapes are also available inside Python 2 Unicode literals. Unicode literals have a u prefix:

 
u'lambda: \u03bb'

variable interpolation

How to interpolate variables into strings.

python:

str.format will take named or positional parameters. When used with named parameters str.format can mimic the variable interpolation feature of the other languages.

A selection of variables in scope can be passed explicitly:

 
count = 3
item = 'ball'
print('{count} {item}s'.format(
  count=count,
  item=item))

Python 3 has format_map which accepts a dict as an argument:

 
count = 3
item = 'ball'
print('{count} {item}s'.format_map(locals()))

custom delimiters

How to specify custom delimiters for single and double quoted strings. These can be used to avoid backslash escaping. If the left delimiter is (, [, or { the right delimiter must be ), ], or }, respectively.

sprintf

How to create a string using a printf style format.

python:

The % operator will interpolate arguments into printf-style format strings.

The str.format with positional parameters provides an alternative format using curly braces {0}, {1}, … for replacement fields.

The curly braces are escaped by doubling:

 
'to insert parameter {0} into a format, use {{{0}}}'.format(3)

If the replacement fields appear in sequential order and aren't repeated, the numbers can be omitted:

 
'lorem {} {} {}'.format('ipsum', 13, 3.7)

here document

Here documents are strings terminated by a custom identifier. They perform variable substitution and honor the same backslash escapes as double quoted strings.

python:

Python lacks variable interpolation in strings. Triple quotes honor the same backslash escape sequences as regular quotes, so triple quotes can otherwise be used like here documents:

 
s = '''here document
there computer
'''

concatenate

The string concatenation operator.

replicate

The string replication operator.

split, in two, with delimiters, into characters

How to split a string containing a separator into an array of substrings; how to split a string in two; how to split a string with the delimiters preserved as separate elements; how to split a string into an array of single character strings.

python:

str.split() takes simple strings as delimiters; use re.split() to split on a regular expression:

 
# raw strings keep the backslash in \s intact:
re.split(r'\s+', 'do re mi fa')
re.split(r'\s+', 'do re mi fa', 1)

join

How to concatenate the elements of an array into a string with a separator.

case manipulation

How to put a string into all caps or all lower case letters. How to capitalize the first letter of a string.

strip

How to remove whitespace from the ends of a string.

pad on right, on left

How to pad the edge of a string with spaces so that it is a prescribed length.

length

How to get the length in characters of a string.

index of substring

How to find the index of the leftmost occurrence of a substring in a string; how to find the index of the rightmost occurrence.

python:

Methods for splitting a string into three parts using the first or last occurrence of a substring:

 
'do re re mi'.partition('re')     # returns ('do ', 're', ' re mi')
'do re re mi'.rpartition('re')    # returns ('do re ', 're', ' mi')

extract substring

How to extract a substring from a string by index.

extract character

How to extract a character from a string by its index.

chr and ord

Converting characters to ASCII codes and back.

Python does not have character literals, so characters are represented by strings of length one. Java has a char type with single quoted literals such as 'a'.
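
A minimal Python sketch of converting between length-one strings and their character codes:

```python
code = ord('a')
char = chr(code)
shifted = chr(ord('a') + 1)

# A string is a sequence of length-one strings.
codes = [ord(c) for c in 'abc']
```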

character translation

How to apply a character mapping to a string.

Regular Expressions

Regular expressions or regexes are a way of specifying sets of strings. If a string belongs to the set, the string and regex "match". Regexes can also be used to parse strings.

The modern notation for regexes was introduced by Unix command line tools in the 1970s. POSIX standardized the notation into two types: extended regexes and the more archaic basic regexes. Perl regexes are extended regexes augmented by new character class abbreviations and a few other features introduced by the Perl interpreter in the 1990s. All the languages in this sheet use Perl regexes.

Any string that doesn't contain regex metacharacters is a regex which matches itself. The regex metacharacters are: [ ] . | ( ) * + ? { } ^ $ \

character classes: [ ] .

A character class is a set of characters in brackets: [ ]. When used in a regex it matches any character it contains.

Character classes have their own set of metacharacters: ^ - \ ]

The ^ is only special when it is the first character in the character class. Such a character class matches its complement; that is, any character not inside the brackets. When not the first character the ^ refers to itself.

The hyphen is used to specify character ranges: e.g. 0-9 or A-Z. When the hyphen is first or last inside the brackets it matches itself.

The backslash can be used to escape the above characters or the terminal character class delimiter: ]. It can be used in character class abbreviations or string backslash escapes.

The period . is a character class abbreviation which matches any character except for newline. In all languages the period can be made to match all characters. In PHP and Perl use the s modifier. In Python use the re.S flag. In Ruby use the m modifier.


character class abbreviations:

abbrev   name                                              character class
\d       digit                                             [0-9]
\D       nondigit                                          [^0-9]
\h       PHP, Perl: horizontal whitespace character        PHP, Perl: [ \t]
         Ruby: hex digit                                   Ruby: [0-9a-fA-F]
\H       PHP, Perl: not a horizontal whitespace character  PHP, Perl: [^ \t]
         Ruby: not a hex digit                             Ruby: [^0-9a-fA-F]
\s       whitespace character                              [ \t\r\n\f]
\S       non whitespace character                          [^ \t\r\n\f]
\v       vertical whitespace character                     [\r\n\f]
\V       not a vertical whitespace character               [^\r\n\f]
\w       word character                                    [A-Za-z0-9_]
\W       non word character                                [^A-Za-z0-9_]

alternation and grouping: | ( )

The vertical pipe | is used for alternation and parens () for grouping.

A vertical pipe takes as its arguments everything up to the next vertical pipe, enclosing paren, or end of string.

Parentheses control the scope of alternation and the quantifiers described below. They are also used for capturing groups, which are the substrings which matched parenthesized parts of the regular expression. Each language numbers the groups and provides a mechanism for extracting them when a match is made. A parenthesized subexpression can be removed from the groups with this syntax: (?:expr)

quantifiers: * + ? { }

As an argument quantifiers take the preceding regular character, character class, or group. A quantified expression can itself be quantified when it is wrapped in a group, so that ^(a{4})*$ matches strings consisting of the letter a repeated a multiple of 4 times.

quantifier   # of occurrences of argument matched
*            zero or more, greedy
+            one or more, greedy
?            zero or one, greedy
{m,n}        m to n, greedy
{n}          exactly n
{m,}         m or more, greedy
{,n}         zero to n, greedy
*?           zero or more, lazy
+?           one or more, lazy
{m,n}?       m to n, lazy
{m,}?        m or more, lazy
{,n}?        zero to n, lazy

When there is a choice, greedy quantifiers will match the maximum possible number of occurrences of the argument. Lazy quantifiers match the minimum possible number.
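
The difference between greedy and lazy quantifiers can be sketched with Python's re module. The sample string is made up for the example; the last line quantifies a group to test for a multiple of four repetitions.

```python
import re

s = '<b>bold</b> and <i>italic</i>'

# Greedy: .* takes as much as it can while still letting > match.
greedy = re.search(r'<.*>', s).group(0)

# Lazy: .*? takes as little as it can.
lazy = re.search(r'<.*?>', s).group(0)

# A quantified group: strings of a's whose length is a multiple of 4.
multiple_of_4 = [bool(re.match(r'^(a{4})*$', t)) for t in ['', 'aaaa', 'aaaaa']]
```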


anchors: ^ $

anchor   matches
^        beginning of a string. In Ruby or when m modifier is used also matches right side of a newline
$        end of a string. In Ruby or when m modifier is used also matches left side of a newline
\A       beginning of the string
\b       word boundary. In between a \w and a \W character or in between a \w character and the edge of the string
\B       not a word boundary. In between two \w characters or two \W characters
\z       end of the string
\Z       end of the string unless it is a newline, in which case it matches the left side of the terminal newline

escaping: \

To match a metacharacter, put a backslash in front of it. To match a backslash use two backslashes.

literal, custom delimited literal

The literal for a regular expression; the literal for a regular expression with a custom delimiter.

python:

Python does not have a regex literal, but the re.compile function can be used to create regex objects.

Compiling regexes can always be avoided:

 
# raw strings keep the backslashes in \d and \w intact:
re.compile(r'\d{4}').search('1999')
re.search(r'\d{4}', '1999')
 
re.compile('foo').sub('bar', 'foo bar')
re.sub('foo', 'bar', 'foo bar')
 
re.compile(r'\w+').findall('do re me')
re.findall(r'\w+', 'do re me')

character class abbreviations and anchors

The supported character class abbreviations and anchors.

Note that \h refers to horizontal whitespace (i.e. a space or tab) in PHP and Perl and a hex digit in Ruby. Similarly \H refers to something that isn't horizontal whitespace in PHP and Perl and isn't a hex digit in Ruby.

match test

How to test whether a string matches a regular expression.

python:

The re.match function is like the re.search function, except that it only succeeds if the regular expression matches at the beginning of the string. The match does not have to extend to the end of the string.
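
A minimal sketch contrasting re.match and re.search on the same string. It assumes standard re semantics, in which re.match only tries the pattern at position 0.

```python
import re

s = 'lorem ipsum'

at_start = re.match(r'lorem', s) is not None       # pattern is at position 0
not_at_start = re.match(r'ipsum', s) is not None   # match does not scan ahead
anywhere = re.search(r'ipsum', s) is not None      # search scans the string

# To require that the whole string matches, anchor the pattern explicitly.
whole = re.match(r'lorem$', s) is not None         # 'lorem' is only a prefix
```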

case insensitive match test

How to perform a case insensitive match test.

modifiers

Modifiers that can be used to adjust the behavior of a regular expression.

The lists are not comprehensive. For all languages except Ruby there are additional modifiers.

modifier   behavior
e          PHP: when used with preg_replace, the replacement string, after backreferences are substituted, is eval'ed as PHP code and the result is used as the replacement.
i, re.I    all: ignores case. Upper case letters match lower case letters and vice versa.
m, re.M    PHP, Perl, Python: makes the ^ and $ match the right and left edge of newlines in addition to the beginning and end of the string.
           Ruby: makes the period . match newline characters.
o          Ruby: performs variable interpolation #{ } only once per execution of the program.
p          Perl: sets ${^MATCH} ${^PREMATCH} and ${^POSTMATCH}
s, re.S    PHP, Perl, Python: makes the period . match newline characters.
x, re.X    all: ignores whitespace in the regex which permits it to be used for formatting.

Python modifiers are bit flags. To use more than one flag at the same time, join them with bit or: |
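
A sketch of combining and applying flags in Python. The sample text and patterns are made up for the example.

```python
import re

text = 'Lorem\nIPSUM dolor'

# re.I ignores case; re.M lets ^ match after a newline.
pattern = re.compile(r'^ipsum', re.I | re.M)
found = pattern.search(text).group(0)

# re.S lets the period match newline characters.
spans_lines = re.search(r'Lorem.IPSUM', text, re.S) is not None
single_line = re.search(r'Lorem.IPSUM', text) is not None

# re.X ignores whitespace and comments inside the pattern.
year = re.search(r'''
    \d{4}   # four digit year
''', 'in 1999', re.X).group(0)
```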

substitution

How to replace all occurrences of a matching pattern in a string with the provided substitution string.

python:

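The count argument of sub controls the number of occurrences which are replaced. It is the 3rd argument of the sub method of a compiled regex and the 4th argument of re.sub.

 
s = 'foo bar bar'
re.compile('bar').sub('baz', s, 1)

If the count argument is omitted, all occurrences are replaced.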

match, prematch, postmatch

How to get the substring that matched the regular expression, as well as the part of the string before and after the matching substring.

group capture

How to get the substrings which matched the parenthesized parts of a regular expression.
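
In Python the captured groups are available on the match object. A minimal sketch; the date string is made up for the example.

```python
import re

m = re.search(r'(\d{4})-(\d{2})-(\d{2})', 'released 2010-06-03')

whole = m.group(0)              # the entire match
year, month, day = m.groups()   # the captured groups, in order

# (?: ) groups without capturing, so it is not numbered.
n = re.search(r'(?:\d{4})-(\d{2})', '2010-06')
only_group = n.group(1)
```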

scan

How to return all non-overlapping substrings which match a regular expression as an array.

backreference in match and substitution

How to use backreferences in a regex; how to use backreferences in the replacement string of substitution.
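
A short Python sketch of both uses of backreferences, with raw strings so that \1 and \2 reach the regex engine unchanged:

```python
import re

# \1 inside the pattern matches the text captured by group 1.
doubled = re.search(r'(\w+) \1', 'do do re').group(0)

# \1 and \2 inside the replacement refer to the captured groups.
swapped = re.sub(r'(\w+) (\w+)', r'\2 \1', 'do re')
```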

recursive regex

Examples of recursive regexes.

The examples match substrings containing balanced parens.

Date and Time Footnotes

In ISO 8601 terminology, a date specifies a day in the Gregorian calendar and a time does not contain date information; it merely specifies a time of day. A data type which combines both date and time information is probably more useful than one which contains just date information or just time information; it is unfortunate that ISO 8601 doesn't provide a name for this entity. The word timestamp often gets used to denote a combined date and time. PHP and Python use the compound noun datetime for combined date and time values.

A useful property of ISO 8601 dates, times, and date/time combinations is that they are correctly ordered by a lexical sort on their string representations. This is because they are big-endian (the year is the leftmost element) and they use fixed-length fields for each term in the string representation.
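
The ordering property can be checked with a small sketch. The timestamps are made up for the example and all use the same fixed-length format.

```python
from datetime import datetime

stamps = [
    '2011-10-02T09:15:00',
    '2009-12-31T23:59:59',
    '2011-02-17T18:00:00',
]

# Sorting the strings lexically...
lexical = sorted(stamps)

# ...gives the same order as sorting the parsed values chronologically.
fmt = '%Y-%m-%dT%H:%M:%S'
chronological = sorted(stamps, key=lambda s: datetime.strptime(s, fmt))
```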

The C standard library provides two methods for representing dates. The first is the UNIX epoch, which is the number of seconds since the start of January 1, 1970 in UTC. If such a time were stored in a 32-bit signed integer, the rollover would happen on January 19, 2038 at 03:14:07 UTC.
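
The rollover moment can be computed directly. The sketch works in UTC with naive datetime values; in time zones several hours west of UTC the local calendar date of the same instant is one day earlier.

```python
from datetime import datetime, timedelta

epoch = datetime(1970, 1, 1)            # naive value, taken to be UTC
max_int32 = 2 ** 31 - 1

# The last second representable in a signed 32-bit counter.
rollover = epoch + timedelta(seconds=max_int32)
```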

The other method of representing dates is the tm struct, a definition of which can be found on Unix systems in /usr/include/time.h:

 
struct tm {
        int     tm_sec;         /* seconds after the minute [0-60] */
        int     tm_min;         /* minutes after the hour [0-59] */
        int     tm_hour;        /* hours since midnight [0-23] */
        int     tm_mday;        /* day of the month [1-31] */
        int     tm_mon;         /* months since January [0-11] */
        int     tm_year;        /* years since 1900 */
        int     tm_wday;        /* days since Sunday [0-6] */
        int     tm_yday;        /* days since January 1 [0-365] */
        int     tm_isdst;       /* Daylight Savings Time flag */
        long    tm_gmtoff;      /* offset from CUT in seconds */
        char    *tm_zone;       /* timezone abbreviation */
};

Python uses and exposes the tm struct of the standard library. In the case of Perl, the first nine values of the struct (up to the member tm_isdst) are put into an array. Python, meanwhile, has a module called time which is a thin wrapper to the standard library functions which operate on this struct. Here is how to get a tm struct in Python:

 
import time
 
utc = time.gmtime(time.time())
t = time.localtime(time.time())

The tm struct is a low level entity, and interacting with it directly should be avoided. In the case of Python it is usually sufficient to use the datetime module instead. For Perl, one can use the Time::Piece module to wrap the tm struct in an object.

date/time type

The data type used to hold a combined date and time.

current date/time

How to get the combined date and time for the present moment in both local time and UTC.

to unix epoch, from unix epoch

How to convert the native date/time type to the Unix epoch which is the number of seconds since the start of January 1, 1970 UTC.
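
One possible way to do the conversion in Python, sketched under the assumption that the naive datetime value represents UTC. calendar.timegm is used because it interprets the time tuple as UTC rather than local time.

```python
import calendar
from datetime import datetime, timedelta

dt = datetime(2011, 5, 3, 10, 22, 15)   # naive value, taken to be UTC

# To Unix epoch: timegm interprets the time tuple as UTC.
epoch = calendar.timegm(dt.timetuple())

# From Unix epoch back to a naive UTC value.
restored = datetime(1970, 1, 1) + timedelta(seconds=epoch)
```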

current unix epoch

How to get the current time as a Unix epoch timestamp.

strftime

How to format a date/time as a string using the format notation of the strftime function from the standard C library. This same format notation is used by the Unix date command.

get fields such as HOUR_OF_DAY

Given a date object, how to get fields such as HOUR_OF_DAY, DAY_OF_WEEK, SECOND, MINUTE, etc.

java:

Java's java.util.Calendar has some counterintuitive behavior. The DAY_OF_WEEK field "takes values SUNDAY, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY, and SATURDAY." When a program prints 6, the day of the week is FRIDAY. The value is a constant that means FRIDAY and has nothing to do with the beginning of the week: it coincides with the position of the day in a week that starts on SUNDAY (1), but it does not change if the first day of the week is redefined. Compare this with WEEK_OF_MONTH and WEEK_OF_YEAR, which are documented as depending on the first day of the week. Test that this behaves as expected for your purposes. If a number representing the day of the week is needed with 1 meaning Monday and 7 meaning Sunday, it can be computed with a workaround:

int dayOfWeek = myCalendar.get(Calendar.DAY_OF_WEEK) - 1;
if (dayOfWeek == 0)
    dayOfWeek = 7;

default format example

Examples of how a date/time object appears when treated as a string such as when it is printed to standard out.

The formats are in all likelihood locale dependent. The provided examples come from a machine running Mac OS X in the Pacific time zone of the USA.

strptime

How to parse a date/time using the format notation of the strptime function from the standard C library.
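
A small Python sketch that parses a string with strptime and formats it back with strftime. The format string and sample text are illustrative:

```python
from datetime import datetime

# Illustrative format and input; both are assumptions of this sketch.
fmt = '%Y-%m-%d %H:%M:%S'
parsed = datetime.strptime('2011-08-23 10:36:51', fmt)

# The same notation drives formatting, so the round trip is lossless here.
formatted = parsed.strftime(fmt)
```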

parse date w/o format

How to parse a date without providing a format string.

result date subtraction

The data type that results when subtraction is performed on two combined date and time values.

add time duration

How to add a time duration to a date/time.

A time duration can easily be added to a date/time value when the value is a Unix epoch value.

ISO 8601 distinguishes between a time interval, which is defined by two date/time endpoints, and a duration, which is the length of a time interval and can be defined by a unit of time such as '10 minutes'. A time interval can also be defined by date and time representing the start of the interval and a duration.

ISO 8601 defines notation for durations. This notation starts with a 'P' and uses a 'T' to separate the day and larger units from the hour and smaller units. Observing the location relative to the 'T' is important for interpreting the letter 'M', which is used for both months and minutes.
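
In Python a duration is a timedelta. A minimal sketch, with arbitrary sample values:

```python
from datetime import datetime, timedelta

# Illustrative start time.
start = datetime(2000, 1, 1, 0, 0, 0)

# A ten minute duration; ISO 8601 would write this as PT10M.
ten_minutes = timedelta(minutes=10)

end = start + ten_minutes

# Subtracting two datetimes yields a timedelta again.
elapsed = end - start
```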

local timezone

Whether date/time values include timezone information. When a date/time value for the local time is created, how the local timezone is determined.

A date/time value can represent a local time but not have any timezone information associated with it.

On Unix systems processes determine the local timezone by inspecting the file /etc/localtime.

timezone name, offset from UTC, is daylight savings?

How to get time zone information: the name of the timezone, the offset in hours from UTC, and whether the timezone is currently in daylight savings.

Timezones are often identified by three or four letter abbreviations. As can be seen from the list, many of the abbreviations do not uniquely identify a timezone. Furthermore many of the timezones have been altered in the past. The Olson database (aka Tz database) decomposes the world into zones in which the local clocks have all been set to the same time since 1970 and it gives these zones unique names.

microseconds

How to get the microseconds component of a combined date and time value. The SI abbreviations for milliseconds and microseconds are ms and μs, respectively. The C standard library uses the letter u as an abbreviation for micro. Here is a struct defined in /usr/include/sys/time.h:

 
struct timeval {
  time_t       tv_sec;   /* seconds since Jan. 1, 1970 */
  suseconds_t  tv_usec;  /* and microseconds */
};

sleep

How to put the process to sleep for a specified number of seconds. In Python and Ruby the default version of sleep supports a fractional number of seconds.

timeout

How to cause a process to timeout if it takes too long.

Techniques relying on SIGALRM only work on Unix systems.

Array Footnotes

What the languages call their basic container types:

python

java

array

list, tuple, sequence

array, List

dictionary

dict, mapping

Map

python:

Python has the mutable list and the immutable tuple. Both are sequences. A full-featured mutable sequence class implements __getitem__, __setitem__, __delitem__, __len__, __contains__, __iter__, __add__, __mul__, __radd__, and __rmul__. An immutable sequence such as tuple omits __setitem__ and __delitem__.

java:

Java provides generic container types for lists, sets, dictionaries (maps), etc.

literal

Array literal syntax.

quote words

The quote words operator, which is a literal for arrays of strings where each string contains a single word.

size

How to get the number of elements in an array.

empty test

How to test whether an array is empty.

lookup

How to access a value in an array by index.

python:

A negative index -n refers to the element at position length - n, so a[-1] is the last element.

 
>>> a = [1,2,3]
>>> a[-1]
3

update

How to update the value at an index.

out-of-bounds behavior

What happens when the value at an out-of-bounds index is referenced.

index of array element

Some techniques for getting the index of an array element.

slice by endpoints, by length

How to slice a subarray from an array by specifying a start index and an end index; how to slice a subarray from an array by specifying an offset index and a length.

python:

Slices can leave the first or last index unspecified, in which case the first or last index of the sequence is used:

 
>>> a=[1,2,3,4,5]
>>> a[:3]
[1, 2, 3]

Python has notation for taking every nth element:

 
>>> a=[1,2,3,4,5]
>>> a[::2] 
[1, 3, 5]

The third argument in the colon-delimited slice argument can be negative, which reverses the order of the result:

 
>>> a = [1,2,3,4]
>>> a[::-1]
[4, 3, 2, 1]

slice to end

How to slice to the end of an array.

The examples take all but the first element of the array.

manipulate back

How to add and remove elements from the back or high index end of an array.

These operations can be used to use the array as a stack.

manipulate front

How to add and remove elements from the front or low index end of an array.

These operations can be used to use the array as a stack. They can be used with the operations that manipulate the back of the array to use the array as a queue.

concatenate

How to create an array by concatenating two arrays; how to modify an array by concatenating another array to the end of it.

replicate

How to create an array containing the same value replicated n times.

address copy, shallow copy, deep copy

How to make an address copy, a shallow copy, and a deep copy of an array.

After an address copy is made, modifications to the copy also modify the original array.

After a shallow copy is made, the addition, removal, or replacement of elements in the copy does not modify the original array. However, if elements in the copy are modified, those elements are also modified in the original array.

A deep copy is a recursive copy. The original array is copied and a deep copy is performed on all elements of the array. No change to the contents of the copy will modify the contents of the original array.

python:

The slice operator can be used to make a shallow copy:

 
a2 = a[:]

list(v) always returns a list, but v[:] returns a value of the same type as v. The slice operator can be used in this manner on strings and tuples but there is little incentive to do so since both are immutable.

copy.copy can be used to make a shallow copy on types that don't support the slice operator such as a dictionary. Like the slice operator copy.copy returns a value with the same type as the argument.
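
The three kinds of copy can be contrasted in a short sketch. The variable names and sample data are illustrative:

```python
import copy

# Illustrative nested list so that shallow and deep copies differ.
a = [1, [2, 3]]

alias = a                  # address copy: same object
shallow = a[:]             # shallow copy: new outer list, shared inner list
deep = copy.deepcopy(a)    # deep copy: nothing is shared

shallow.append(4)          # does not affect a
shallow[1].append(99)      # modifies the inner list shared with a
deep[1].append(100)        # does not affect a
```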

arrays as function arguments

How arrays are passed as arguments.

iteration

How to iterate through the elements of an array.

indexed iteration

How to iterate through the elements of an array while keeping track of the index of each element.
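
In Python this is usually done with enumerate. A brief sketch with sample data:

```python
# Illustrative list.
letters = ['a', 'b', 'c']

pairs = []
for i, letter in enumerate(letters):
    # i is the index, letter is the element at that index.
    pairs.append((i, letter))
```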

iterate over range

Iterate over a range without instantiating it as a list.

instantiate range as array

How to convert a range to an array.

Python 3 ranges and Ruby ranges implement some of the functionality of arrays without allocating space to hold all the elements.

python:

In Python 2 range() returns a list.

In Python 3 range() returns an object which implements the immutable sequence API.
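
A small sketch of the Python 3 behavior, assuming Python 3. The bounds are arbitrary:

```python
# No million-element list is allocated here (Python 3).
r = range(1, 1000001)

size = len(r)           # sequence operations work without materializing
eleventh = r[10]
found = 500 in r

# Instantiate a range as a list only when a real list is needed.
first_five = list(range(5))
```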

reverse

How to create a reversed copy of an array, and how to reverse an array in place.

python:

reversed returns an iterator which can be used in a for/in construct:

 
print("counting down:")
for i in reversed([1,2,3]):
  print(i)

reversed can be used to create a reversed list:

 
a = list(reversed([1,2,3]))

sort

How to create a sorted copy of an array, and how to sort an array in place. Also, how to set the comparison function when sorting.
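
A Python sketch of the three cases. The word list is illustrative, and the key argument stands in for a comparison function, since Python 3 sorts by key rather than by a comparison callback:

```python
# Illustrative data.
words = ['pear', 'Fig', 'apple']

# Sorted copy; uppercase letters sort before lowercase ones.
by_default = sorted(words)

# Sorted copies with a custom ordering supplied via key.
by_lower = sorted(words, key=str.lower)
by_length_desc = sorted(words, key=len, reverse=True)

# In-place sort; returns None and modifies words itself.
words.sort()
```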

dedupe

How to remove extra occurrences of elements from an array.

python:

Python sets support the len, in, and for operators. It may be more efficient to work with the result of the set constructor directly rather than convert it back to a list.
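
A brief sketch of both approaches. The order-preserving variant assumes Python 3.7 or later, where dicts keep insertion order:

```python
# Illustrative data with repeats.
a = [3, 1, 3, 2, 1]

# Work with the set directly; order is not preserved.
unique = set(a)

# Order-preserving dedupe (assumes insertion-ordered dicts, Python 3.7+).
ordered_unique = list(dict.fromkeys(a))
```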

Set Footnotes

membership

How to test for membership in an array.

intersection

How to compute an intersection.

python:

Python has literal notation for sets:

 
{1,2,3}

Use set and list to convert lists to sets and vice versa:

 
a = list({1,2,3})
ensemble = set([1,2,3])

union

relative complement, symmetric difference

How to compute the relative complement of two arrays or sets; how to compute the symmetric difference.
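
The set operators in Python, sketched on two small sample sets:

```python
# Illustrative sets.
a = {1, 2, 3}
b = {2, 3, 4}

intersection = a & b          # elements in both
union = a | b                 # elements in either
relative_complement = a - b   # elements of a not in b
symmetric_difference = a ^ b  # elements in exactly one of the two
```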

pick random element

How to pick a random element from a set.

map

Create an array by applying a function to each element of a source array.

filter

Create an array containing the elements of a source array which match a predicate.

reduce

Return the result of applying a binary operator to all the elements of the array.

python:

reduce is not needed to sum a list of numbers:

 
sum([1,2,3])

universal and existential tests

How to test whether a condition holds for all members of an array; how to test whether a condition holds for at least one member of an array.

A universal test is always true for an empty array. An existential test is always false for an empty array.

An existential test can readily be implemented with a filter. A universal test can also be implemented with a filter, but it is more work: one must set the condition of the filter to the negation of the predicate and test whether the result is empty.
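
A Python sketch comparing the built-in tests with the filter-based versions described above. The predicate is_even and the sample list are illustrative:

```python
def is_even(n):
    # Illustrative predicate.
    return n % 2 == 0

nums = [2, 4, 5]

# Built-in universal and existential tests.
all_even = all(is_even(n) for n in nums)
any_even = any(is_even(n) for n in nums)

# Existential test via filter: is anything left after filtering?
any_via_filter = len(list(filter(is_even, nums))) > 0

# Universal test via filter: filter on the negated predicate, expect nothing.
all_via_filter = len(list(filter(lambda n: not is_even(n), nums))) == 0

# Behavior on an empty array.
empty_all = all(is_even(n) for n in [])
empty_any = any(is_even(n) for n in [])
```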

shuffle and sample

How to shuffle an array. How to extract a random sample from an array.

zip

How to interleave arrays. In the case of two arrays the result is an array of pairs or an associative list.
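
In Python the built-in zip does this. A minimal sketch with sample lists:

```python
# Illustrative inputs.
nums = [1, 2, 3]
letters = ['a', 'b', 'c']

# In Python 3 zip returns an iterator, so list() materializes the pairs.
pairs = list(zip(nums, letters))
```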

java:

Java does not have language-level support for this operation. It can be implemented as follows:

1. Define a Pair class.
2. Create a list of Pair objects.
3. Loop through the two input lists and add each pair of corresponding elements to the list of pairs.

Dictionary Footnotes

literal

size

How to get the number of dictionary keys in a dictionary.

lookup

How to look up a dictionary value using a dictionary key.

out-of-bounds behavior

What happens when a lookup is performed on a key that is not in a dictionary.

python:

Use dict.get() to avoid handling KeyError exceptions:

 
d = {}
d.get('lorem')      # returns None
d.get('lorem', '')  # returns ''

is key present

How to check for the presence of a key in a dictionary without raising an exception. The check distinguishes a missing key from a key that is present but mapped to null or to a value which evaluates to false.

delete entry

How to remove a key/value pair from a dictionary.

from array of pairs, from even length array

How to create a dictionary from an array of pairs; how to create a dictionary from an even length array.
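
Both constructions in Python, sketched with sample data:

```python
# Illustrative array of pairs.
pairs = [('a', 1), ('b', 2)]
from_pairs = dict(pairs)

# Illustrative even length array: keys at even indexes, values at odd ones.
flat = ['a', 1, 'b', 2]
from_flat = dict(zip(flat[::2], flat[1::2]))
```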

merge

How to merge the values of two dictionaries.

In the examples, if the dictionaries d1 and d2 share keys then the values from d2 will be used in the merged dictionary.
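
A Python sketch of that behavior. Copying d1 first is a choice made here so that neither input is modified:

```python
# Illustrative dictionaries sharing the key 'b'.
d1 = {'a': 1, 'b': 2}
d2 = {'b': 3, 'c': 4}

# Copy d1, then overlay d2; values from d2 win on shared keys.
merged = dict(d1)
merged.update(d2)
```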

invert

How to turn a dictionary into its inverse. If a key 'foo' is mapped to value 'bar' by a dictionary, then its inverse will map the key 'bar' to the value 'foo'. However, if multiple keys are mapped to the same value in the original dictionary, then some of the keys will be discarded in the inverse.
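
A dict comprehension is one way to do this in Python. The sample dictionaries are illustrative:

```python
# Illustrative dictionary with distinct values.
d = {'foo': 'bar', 'baz': 'qux'}
inverse = {v: k for k, v in d.items()}

# When values repeat, only one of the original keys survives.
dup = {'x': 1, 'y': 1}
dup_inverse = {v: k for k, v in dup.items()}
```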

iteration

How to iterate through the key/value pairs in a dictionary.

python:

In Python 2.7 dict.items() returns a list of pairs and dict.iteritems() returns an iterator on the list of pairs.

In Python 3 dict.items() returns a view object which can be iterated over, and dict.iteritems() has been removed.

keys and values as arrays

How to convert the keys of a dictionary to an array; how to convert the values of a dictionary to an array.

python:

In Python 3 dict.keys() and dict.values() return read-only views into the dict. The following code illustrates the change in behavior:

 
d = {}
keys = d.keys()
d['foo'] = 'bar'
 
if 'foo' in keys:
  print('running Python 3')
else:
  print('running Python 2')

default value, computed value

How to create a dictionary with a default value for missing keys; how to compute and store the value on lookup.
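
One possible Python sketch: collections.defaultdict supplies a default value, and a dict subclass with __missing__ computes and stores a value on lookup. The class name Squares is hypothetical:

```python
from collections import defaultdict

# Default value: missing keys start at int(), which is 0.
counts = defaultdict(int)
for word in ['a', 'b', 'a']:
    counts[word] += 1

class Squares(dict):
    """Hypothetical example: compute and store the value on first lookup."""
    def __missing__(self, key):
        value = key * key
        self[key] = value
        return value

squares = Squares()
nine = squares[3]   # computed on lookup, then stored under the key 3
```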

Function Footnotes

Python has both functions and methods. Ruby only has methods: functions defined at the top level are in fact methods on a special main object. Perl subroutines can be invoked with a function syntax or a method syntax.

function declaration

How to define a function.

function invocation

How to invoke a function.

python:

When invoking methods and functions, parens are mandatory, even for functions which take no arguments. Omitting the parens returns the function or method as an object. Whitespace can occur between the function name and the following left paren.

Starting with 3.0, print is treated as a function instead of a keyword. Thus parens are mandatory around the print argument.

missing argument behavior

How an incorrect number of arguments at invocation is handled.

python:

TypeError is raised if the number of arguments is incorrect.

default value

How to declare a default value for an argument.

variable number of arguments

How to write a function which accepts a variable number of arguments.

python:

This function accepts one or more arguments. Invoking it without any arguments raises a TypeError:

 
def poker(dealer, *players):
  ...

named parameters

How to write a function which uses named parameters and how to invoke it.

python:

In a function definition, the splat operator * collects the remaining arguments into a tuple. In a function invocation, the splat can be used to expand a sequence into separate arguments.

In a function definition the double splat operator ** collects named parameters into a dictionary. In a function invocation, the double splat expands a dictionary into named parameters.

In Python 3 named parameters can be made mandatory:

 
def fequal(x, y, *, eps):
  return abs(x-y) < eps
 
fequal(1.0, 1.001, eps=0.01)  # True
 
fequal(1.0, 1.001)                 # raises TypeError
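
The splat and double splat operators can be seen together in a short sketch. The function describe is hypothetical:

```python
def describe(first, *rest, **options):
    # Hypothetical function: rest is a tuple, options is a dict.
    return first, rest, options

# Collecting: extra positional and named arguments are gathered up.
result = describe(1, 2, 3, sep='-')

# Expanding: a list and a dict are spread into separate arguments.
args = [1, 2, 3]
opts = {'sep': '-'}
same = describe(*args, **opts)
```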

pass number or string by reference

How to pass numbers or strings by reference.

The three common methods of parameter passing are pass by value, pass by reference, and pass by address. Pass by value is the default in most languages.

When a parameter is passed by reference, the callee can change the value in the variable that was provided as a parameter, and the caller will see the new value when the callee returns. When the parameter is passed by value the callee cannot do this.

When a language has mutable data types it can be unclear whether the language is using pass by value or pass by reference.
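
A Python sketch of why this can be unclear: rebinding a parameter is invisible to the caller, but mutating the object it refers to is not. The function names are hypothetical:

```python
def rebind(n):
    # Rebinds the local name only; the caller's variable is untouched.
    n = n + 1

def mutate(items):
    # Modifies the object the caller also refers to.
    items.append(99)

x = 1
rebind(x)

lst = [1]
mutate(lst)
```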

pass array or dictionary by reference

How to pass an array or dictionary without making a copy of it.

return value

How the return value of a function is determined.

multiple return values

How to return multiple values from a function.

lambda declaration and invocation

How to define and invoke a lambda function.

python:

Python lambdas cannot contain statements, and thus are limited to a single expression. Unlike named functions, the value of the expression is returned implicitly, and a return statement is neither necessary nor permitted. Lambdas are closures and can refer to local variables in scope, even if they are returned from that scope.

If a closure function is needed that contains more than one statement, use a nested function:

 
def make_nest(x):
    b = 37
    def nest(y):
        c = x*y
        c *= b
        return c
    return nest
 
n = make_nest(12*2)
print(n(23))

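In Python 2, closures are read only: a nested function can read the variables of its enclosing function but cannot rebind them. Python 3 lifts this restriction for variables declared nonlocal.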

A nested function can be returned and hence be invoked outside of its containing function, but it is not visible by its name outside of its containing function.

function reference

How to store a function in a variable.

python:

Python functions are stored in variables by default. As a result a function and a variable with the same name cannot share the same scope. This is also the reason parens are mandatory when invoking Python functions.

function with private state

How to create a function with private state which persists between function invocations.

python:

Here is a technique for creating private state which exploits the fact that the expression for a default value is evaluated only once:

 
def counter(_state=[0]):
  _state[0] += 1
  return _state[0]
 
print(counter())

closure

How to create a first class function with access to the local variables of the local scope in which it was created.

python:

Python 2 has limited closures: access to local variables in the containing scope is read only and the bodies of anonymous functions must consist of a single expression.

Python 3 permits write access to local variables outside the immediate scope when declared with nonlocal.
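
A minimal Python 3 sketch of a closure that writes to a variable in its enclosing scope. The names make_counter and increment are illustrative:

```python
def make_counter():
    count = 0
    def increment():
        # Python 3 only: allows rebinding count in the enclosing scope.
        nonlocal count
        count += 1
        return count
    return increment

counter = make_counter()
first = counter()
second = counter()
```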

generator

How to create a function which can yield a value back to its caller and suspend execution.

python:

Python generators can be used in for/in statements and list comprehensions.
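
A short sketch of a generator used in a list comprehension and driven by hand with next. The function countdown is hypothetical:

```python
def countdown(n):
    # Hypothetical generator: yields n, n-1, ..., 1.
    while n > 0:
        yield n     # execution suspends here until the next value is requested
        n -= 1

# Used in a list comprehension.
collected = [i for i in countdown(3)]

# Driven manually.
gen = countdown(2)
first = next(gen)
```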

Execution Control Footnotes

if

The if statement.

switch

The switch statement.

while

c-style for

How to write a C-style for loop.

break, continue, redo

break exits a for or while loop immediately. continue goes to the next iteration of the loop. redo goes back to the beginning of the current iteration.

control structure keywords

A list of control structure keywords. The loop control keywords from the previous line are excluded.

The list summarizes the available control structures. It excludes the keywords for exception handling, loading libraries, and returning from functions.

what do does

How the do keyword is used.

statement modifiers

Clauses added to the end of a statement to control execution.

raise exception

How to raise exceptions.

catch exception

How to catch exceptions.

global variable for last exception

The global variable name for the last exception raised.

define exception

How to define a new exception class.

catch exception by type

How to catch exceptions of a specific type and assign the exception a name.

finally/ensure

Clauses that are guaranteed to be executed even if an exception is thrown or caught.
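
A Python sketch that defines an exception, raises it, catches it by type, and uses finally. The names ParseError, parse, and log are hypothetical:

```python
class ParseError(Exception):
    """Hypothetical exception class."""
    pass

log = []

def parse(text):
    try:
        if not text.isdigit():
            raise ParseError('not a number: ' + text)
        return int(text)
    except ParseError as e:
        # Catch by type and bind the exception to the name e.
        log.append('caught: ' + str(e))
        return None
    finally:
        # Runs whether or not an exception was raised.
        log.append('done')

good = parse('42')
bad = parse('x')
```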

start thread

wait on thread

How to make a thread wait for another thread to finish.
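
A minimal Python sketch of starting a thread and waiting for it with join. The worker function and its argument are illustrative:

```python
import threading

results = []

def worker(n):
    # Illustrative work done in the other thread.
    results.append(n * n)

t = threading.Thread(target=worker, args=(7,))
t.start()   # start thread
t.join()    # wait until the thread has finished
```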

File Footnotes

print to standard output

python:

print appends a newline to the output. To suppress this behavior in Python 2, put a trailing comma after the last argument; in Python 3, pass the keyword argument end=''. If given multiple arguments, print joins them with spaces.

In Python 2 print parses as a keyword and parentheses are not required:

 
print "Hello, World!"

read from standard input

How to read from standard input.

standard file handles

The names for standard input, standard output, and standard error.

open file

open file for writing

How to open a file for writing. If the file exists its contents will be overwritten.

open file for append

How to open a file with the seek point at the end of the file. If the file exists its contents will be preserved.

close file

How to close a file.

read line

How to read up to the next newline in a file.

iterate over file by line

How to iterate over a file line by line.

chomp

Remove a newline, carriage return, or carriage return newline pair from the end of a line if there is one.

python:

Python strings are immutable. rstrip returns a modified copy of the string. rstrip('\r\n') is not identical to chomp because it removes all contiguous carriage returns and newlines at the end of the string.
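
The difference can be seen with a small helper that removes at most one line ending. The function chomp is a hypothetical helper, not part of the standard library:

```python
def chomp(line):
    # Hypothetical helper: strip a single trailing line ending, if any.
    for ending in ('\r\n', '\n', '\r'):
        if line.endswith(ending):
            return line[:-len(ending)]
    return line

one_removed = chomp('abc\n\n')          # only the last newline goes
all_removed = 'abc\n\n'.rstrip('\r\n')  # every trailing newline goes
```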

read entire file into array or string

How to read the contents of a file into memory.

write to file

How to write to a file handle.

flush file handle

How to flush a file handle that has been written to.

file test, regular file test

How to test whether a file exists; how to test whether a file is a regular file (i.e. not a directory, special device, or named pipe).

copy file, remove file, rename file

How to copy a file; how to remove a file; how to rename a file.

java:

Apache Commons IO is a common choice, specifically FileUtils.copyFile(). Java SE 7 also provides java.nio.file.Files.copy() in the standard library.

set file permissions

How to set the permissions on the file.

For Perl, Python, and Ruby, the mode argument is in the same format as the one used with the Unix chmod command. It uses bitmasking to get the various permissions which is why it is normally an octal literal.

The mode argument should not be provided as a string such as "0755". Python and Ruby will raise an exception if a string is provided. Perl will convert "0755" to 755 and not 0755 which is equal to 493 in decimal.
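
A Python 3 sketch of setting a mode with an octal literal. The temporary file exists only to give chmod something to act on, and the mode read back is meaningful on Unix-like systems:

```python
import os
import stat
import tempfile

# Illustrative scratch file.
fd, path = tempfile.mkstemp()
os.close(fd)

# Octal literal, not a string.
os.chmod(path, 0o755)
mode = stat.S_IMODE(os.stat(path).st_mode)

# A string must be converted explicitly, stating the base.
from_string = int('0755', 8)

os.remove(path)
```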

temporary file

How to create and use a temporary file.

Temporary file libraries solve two problems: (1) finding an unused pathname, and (2) putting the file in a location where the system will eventually remove it should the application fail to clean up after itself.
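
A minimal Python sketch using the tempfile module. The suffix and contents are illustrative:

```python
import os
import tempfile

# The file is created under an unused name and removed when closed.
with tempfile.NamedTemporaryFile(mode='w+', suffix='.txt') as f:
    f.write('lorem ipsum')
    f.flush()
    f.seek(0)
    content = f.read()
    name = f.name

exists_after = os.path.exists(name)
```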

in memory file

How to create a file descriptor which writes to an in-memory buffer.
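
In Python 3, io.StringIO provides a file-like object backed by memory. A brief sketch:

```python
import io

buf = io.StringIO()

# Written with the same interface as a file opened for text output.
buf.write('lorem\n')
buf.write('ipsum\n')

contents = buf.getvalue()
```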

Directory Footnotes

build pathname

How to construct a pathname without hard coding the system file separator.

dirname and basename

How to extract the directory portion of a pathname; how to extract the non-directory portion of a pathname.
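
A Python sketch of building a pathname and taking it apart again. The path components are illustrative:

```python
import os.path

# Build without hard coding the separator.
path = os.path.join('foo', 'bar', 'baz.txt')

directory = os.path.dirname(path)   # directory portion
filename = os.path.basename(path)   # non-directory portion
```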

absolute pathname

How to get the absolute pathname for a pathname. If the pathname is relative, the current working directory is prepended to it.

In the examples provided, if /foo/bar is the current working directory and .. is the relative path, then the return value is /foo.

iterate over directory by file

How to iterate through the files in a directory.

python:

In Python 2, file() is the file handle constructor. file can be used as a local variable name, but doing so hides the constructor. It can still be invoked by the synonym open(), however. In Python 3 file() has been removed and only open() remains.

make directory

How to create a directory.

If needed, the examples will create more than one directory.

No error will result if a directory at the pathname already exists. An exception will be raised if the pathname is occupied by a regular file, however.
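
A Python sketch of this behavior using os.makedirs. The exist_ok argument assumes Python 3.2 or later, and the directory names are illustrative:

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()   # scratch area for the example
nested = os.path.join(base, 'a', 'b', 'c')

os.makedirs(nested, exist_ok=True)   # creates a, a/b, and a/b/c
os.makedirs(nested, exist_ok=True)   # already exists: no error
created = os.path.isdir(nested)

# A regular file occupying the pathname still causes an exception.
blocked = os.path.join(base, 'file')
open(blocked, 'w').close()
try:
    os.makedirs(blocked, exist_ok=True)
    raised = False
except OSError:
    raised = True

shutil.rmtree(base)
```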

recursive copy

How to perform a recursive copy. If the source is a directory, then the directory and all its contents will be copied.

remove empty directory

How to remove an empty directory. The operation will fail if the directory is not empty.

remove directory and contents

How to remove a directory and all its contents.

directory test

How to determine if a pathname is a directory.

Processes and Environment Footnotes

command line arguments, script name

How to access arguments provided at the command line when the script was run; how to get the name of the script.

getopt

How to process command line options.

Command line options are arguments which start with a special character such as a hyphen '-'. Command line option libraries remove options arguments from the ARGV array but leave other arguments for later processing.
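
One way to do this in Python is the standard argparse module. In this sketch the option names are hypothetical and an explicit argument list stands in for the real command line:

```python
import argparse

parser = argparse.ArgumentParser(description='illustrative parser')
parser.add_argument('-v', '--verbose', action='store_true')
parser.add_argument('-n', '--count', type=int, default=1)
parser.add_argument('files', nargs='*')   # the non-option arguments

# A hypothetical command line; parse_args() with no argument reads sys.argv.
args = parser.parse_args(['-v', '--count', '3', 'a.txt', 'b.txt'])
```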

get and set environment variable

How to get and set an environment variable. If an environment variable is set the new value is inherited by child processes.
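
A Python sketch showing a child process inheriting a variable set by its parent. The variable name DEMO_GREETING is hypothetical:

```python
import os
import subprocess
import sys

# Set the variable in this process (hypothetical name).
os.environ['DEMO_GREETING'] = 'hello'

# A child Python process prints what it inherited.
child_code = 'import os; print(os.environ["DEMO_GREETING"])'
output = subprocess.check_output([sys.executable, '-c', child_code])
child_value = output.decode().strip()
```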

exit

python:

It is possible to register code to be executed upon exit:

 
import atexit
atexit.register(print, "goodbye")

It is possible to terminate a script without executing registered exit code by calling os._exit.

set signal handler

How to register a signal handling function.

executable test

How to test whether a file is executable.

external command

How to execute an external command.

escaped external command

How to prevent shell injection.
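
A Python sketch of two common precautions: passing arguments as a list so that no shell parses them, and quoting with shlex.quote (Python 3.3 or later) when a shell command string cannot be avoided. The hostile string is illustrative:

```python
import shlex
import subprocess
import sys

# Illustrative untrusted input containing a shell metacharacter.
hostile = 'a; echo injected'

# List form: the argument reaches the child verbatim, with no shell involved.
echo_arg = 'import sys; print(sys.argv[1])'
output = subprocess.check_output([sys.executable, '-c', echo_arg, hostile])
received = output.decode().strip()

# Quoted form, for use inside a shell command string.
quoted = shlex.quote(hostile)
```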

backticks

How to invoke an external command and read its output into a variable.

The use of backticks for this operation goes back to the Bourne shell (1977).

python:

A more concise solution is:

 
file = os.popen('ls -l /tmp').read()

os.popen was marked as deprecated in Python 2.6 but it is still available in Python 2.7 and Python 3.2.

Library and Module Footnotes

How terminology is used in this sheet:

A few notes:

According to our terminology, Perl and Java packages are modules, not packages.

PHP and C++ namespaces are another example of modules.

We prefer to reserve the term namespace for divisions of the set of names imposed by the parser. For example, the identifier foo in the Perl variables $foo and @foo belongs to different namespaces. Another example of namespaces in this sense is the Lisp-1 vs. Lisp-2 distinction: Scheme is a Lisp-1 and has a single namespace, whereas Common Lisp is a Lisp-2 and has multiple namespaces.

Some languages (e.g. Python, Java) impose a one-to-one mapping between libraries and modules. All the definitions for a module must be in a single file, and there are typically restrictions on how the file must be named and where it is located on the filesystem. Other languages allow the definitions for a module to be spread over multiple files or permit a file to contain multiple modules. Ruby and C++ are such languages.

load library

Execute the specified file. Normally this is used on a file which only contains declarations at the top level.

reload library

How to reload a library. Altered definitions in the library will replace previous versions of the definition.

library path

How to augment the library path by calling a function or manipulating a global variable.

library path environment variable

How to augment the library path by setting an environment variable before invoking the interpreter.

library path command line option

How to augment the library path by providing a command line option when invoking the interpreter.

main in library

How to put code in a library which executes when the file is run as a top-level script and not when the file is loaded as a library.
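
The usual Python idiom tests the module's __name__. A minimal sketch with a hypothetical main function:

```python
def main():
    # Hypothetical entry point.
    return 'running as a script'

# True only when the file is executed directly, not when it is imported.
if __name__ == '__main__':
    print(main())
```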

module declaration

How to declare a section of code as belonging to a module.

submodule declaration

How to declare a section of code as belonging to a submodule.

module separator

The punctuation used to separate the labels in the full name of a submodule.

import all definitions in module

How to import all the definitions in a module.

import definitions

How to import specific definitions from a module.

managing multiple installations

How to manage multiple versions of the interpreter on the same machine; how to manage multiple versions of 3rd party libraries for the interpreter.

The examples show how to (1) create an installation, (2) enter the environment, (3) display the current environment, and (4) exit the environment.

While in the environment executing the interpreter by its customary name will invoke the version of the interpreter specified when the environment was created. 3rd party libraries installed when in the environment will only be available to processes running in the environment.

python:

virtualenv can be downloaded and installed by running this in the virtualenv source directory:

 
sudo python setup.py install

When virtualenv is run it creates a bin directory with copies of the python executable, pip, and easy_install. When the activate script is sourced the bin directory is prepended to the PATH environment variable.

By default the activate script puts the name of the environment in the shell prompt variable PS1. A different name can be provided with the --prompt flag when virtualenv is run. To remove the name completely it is necessary to edit the activate script.

list installed packages, install a package

How to show the installed 3rd party packages, and how to install a new 3rd party package.

python:

Two ways to list the installed modules and the modules in the standard library:

 
$ pydoc modules
 
$ python
>>> help('modules')

Most 3rd party Python code is packaged using distutils, which is in the Python standard library. The code is placed in a directory with a setup.py file. The code is installed by running the Python interpreter on setup.py with the install argument.

package specification format

The format of the file used to specify a package.

python:

distutils.core reference

Here is an example of how to create a Python package using distutils. Suppose that the file foo.py contains the following code:

 
def add(x, y):
    return x+y

In the same directory as foo.py create setup.py with the following contents:

 
#!/usr/bin/env python
 
from distutils.core import setup
 
setup(name='foo',
      version='1.0',
      py_modules=['foo'],
     )

Create a tarball of the directory for distribution:

 
$ tar cf foo-1.0.tar foo
$ gzip foo-1.0.tar

To install from the tarball, perform the following:

 
$ tar xf foo-1.0.tar.gz
$ cd foo
$ sudo python setup.py install

If you want people to be able to install the package with pip, upload the tarball to the Python Package Index.

Object Footnotes

define class

perl:

The sheet shows how to create objects using the CPAN module Moose. To the client of an object, Moose objects and traditional Perl objects are largely indistinguishable. Moose provides convenience functions to aid in the definition of a class, and as a result a Moose class definition and a traditional Perl class definition look quite different.

The most common keywords used when defining a Moose class are has, extends, subtype.

The before, after, and around keywords are used to define method modifiers. The with keyword indicates that a Moose class implements a role.

The no Moose; statement at the end of a Moose class definition removes class definition keywords, which would otherwise be visible to the client as methods.

Here is how to define a class in the traditional Perl way:

 
package Int;
 
sub new {
  my $class = shift;
  my $v = $_[0] || 0;
  my $self = {value => $v};
  bless $self, $class;
  $self;
}
 
sub value {
  my $self = shift;
  if ( @_ > 0 ) {
    $self->{'value'} = shift;
  }
  $self->{'value'};
}
 
sub add {
  my $self = shift;
  $self->value + $_[0];
}
 
sub DESTROY {
  my $self = shift;
  my $v = $self->value;
  print "bye, $v\n";
}

python:

As of Python 2.2, classes are of two types: new-style classes and old-style classes. The class type is determined by the type of class(es) the class inherits from. If no superclasses are specified, then the class is old-style. As of Python 3.0, all classes are new-style.

New-style classes have these features which old-style classes don't:

create object

How to create an object.

get and set attribute

How to get and set an attribute.

python:

Defining explicit setters and getters in Python is considered poor style. If it becomes necessary to add extra logic to attribute access, this can be achieved without disrupting the clients of the class by creating a property:

 
def getValue(self):
  print("getValue called")
  return self.__dict__['value']
def setValue(self,v):
  print("setValue called")
  self.__dict__['value'] = v
value = property(fget=getValue, fset = setValue)

instance variable accessibility

How instance variable access works.

define method

How to define a method.

invoke method

How to invoke a method.

destructor

How to define a destructor.

python:

A Python destructor is not guaranteed to be called as soon as all references to an object go out of scope, although in practice this is how the reference-counting CPython implementation usually behaves.

method missing

How to handle when a caller invokes an undefined method.

python:

__getattr__ is invoked when an attribute (instance variable or method) is missing. By contrast, __getattribute__, which is available for new-style classes (all classes in Python 3), is always invoked, and can be used to intercept access to attributes that exist. __setattr__ and __delattr__ are invoked on every attempt to set or delete an attribute, whether or not it already exists. The del statement is used to delete an attribute.
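
A Python sketch of handling undefined methods with __getattr__. The class Recorder is hypothetical:

```python
class Recorder(object):
    """Hypothetical class that records calls to undefined methods."""
    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Reached only when normal attribute lookup fails.
        def method(*args):
            self.calls.append((name, args))
        return method

r = Recorder()
r.anything(1, 2)   # no such method is defined; __getattr__ supplies one
```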

inheritance

How to use inheritance.

invoke class method

How to invoke a class method.

Reflection Footnotes

object id

How to get an identifier for an object or a value.

inspect type

basic types

inspect class

How to get the class of an object.

inspect class hierarchy

has method?

python:

hasattr(o,'reverse') will also return True if there is an instance variable named 'reverse', so it does not by itself show that 'reverse' is a method.

message passing

eval

How to interpret a string as code and return its value.

python:

The argument of eval must be an expression or a SyntaxError is raised. The Python version of the mini-REPL is thus considerably less powerful than the versions for the other languages. It cannot define a function or even create a variable via assignment.
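
A Python 3 sketch of the restriction. It also uses the separate built-in exec, which accepts statements, to show the contrast; the code strings are illustrative:

```python
# An expression is accepted and its value returned.
value = eval('1 + 2')

# An assignment is a statement, so eval rejects it.
try:
    eval('x = 1')
    assignment_ok = True
except SyntaxError:
    assignment_ok = False

# exec runs statements; definitions land in the supplied namespace.
namespace = {}
exec('x = 1\ndef double(n):\n    return 2 * n', namespace)
doubled = namespace['double'](namespace['x'])
```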

inspect methods

inspect attributes

python:

dir(o) returns methods and instance variables.

pretty print

How to display the contents of a data structure for debugging purposes.

source line number and file name

How to get the current line number and file name of the source code.

Web

http get

How to make an HTTP GET request and read the response into a string.

url encode/decode

How to URL encode and URL unencode a string.

URL encoding, also called percent encoding, is described in RFC 3986. It replaces all characters except for the letters, digits, and a few punctuation marks with a percent sign followed by their two digit hex encoding. The characters which are not escaped are:

 
A-Z a-z 0-9 - _ . ~

URL encoding can be used to encode UTF-8, in which case each byte of a UTF-8 character is encoded separately.

When form data is sent from a browser to a server via an HTTP GET or an HTTP POST, the data is percent encoded but spaces are replaced by plus signs + instead of %20. The MIME type for form data is application/x-www-form-urlencoded.

python:

In Python 3 the functions quote_plus, unquote_plus, quote, and unquote moved from urllib to urllib.parse.

urllib.quote replaces a space character with %20.

urllib.unquote does not replace + with a space character.
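The difference between the two pairs of functions, sketched with the Python 3 module urllib.parse (in Python 2 the same names live in urllib). The sample string is arbitrary:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "a b+c"

percent = quote(text)        # space becomes %20, plus sign becomes %2B
form = quote_plus(text)      # space becomes +,   plus sign becomes %2B

kept_plus = unquote("a+b")          # + is left alone
space_for_plus = unquote_plus("a+b")  # + is decoded as a space

# Non-ASCII text is encoded as UTF-8, one escape per byte.
accented = quote("\u00e9")
```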

base64 encode

How to encode binary data in ASCII using the Base64 encoding scheme.
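A minimal round trip with the standard base64 module, assuming Python 3 where the functions take and return bytes. The sample bytes are arbitrary:

```python
import base64

raw = b"\x00\xffhi"                    # arbitrary binary data
encoded = base64.b64encode(raw)        # bytes containing only ASCII characters
as_text = encoded.decode("ascii")      # safe to embed in text
decoded = base64.b64decode(encoded)    # original bytes again
```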

json

How to encode data in a JSON string; how to decode such a string.
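A short sketch using the standard json module. The sample data is arbitrary, and sort_keys is used only to make the output order predictable:

```python
import json

data = {"t": 1, "f": [True, None]}

text = json.dumps(data, sort_keys=True)   # Python values to a JSON string
restored = json.loads(text)               # JSON string back to Python values
```

Note that True and None are written as the JSON literals true and null.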

build xml

How to build an XML document.

An XML document can be constructed by concatenating strings, but the techniques illustrated here guarantee the result to be well-formed XML.
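As an illustration of the guarantee, here is a sketch using xml.etree.ElementTree from the Python 3 standard library. The element and attribute names are hypothetical:

```python
import xml.etree.ElementTree as ET

root = ET.Element("a")
child = ET.SubElement(root, "b", {"id": "123"})
child.text = "x < y & z"    # special characters are escaped on output

# encoding="unicode" asks for a str instead of bytes.
xml_text = ET.tostring(root, encoding="unicode")

# The result parses back to the original text.
round_trip = ET.fromstring(xml_text).find("b").text
```

String concatenation would have emitted the raw < and &, producing a document that does not parse.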

parse xml

How to parse XML.

xpath

How to extract data from XML using XPath.

Tests

test class

How to define a test class.

run tests; run test method

How to run all the tests in a test class; how to run a single test from the test class.

equality assertion

How to test for equality.

regex assertion

How to test that a string matches a regex.

exception assertion

How to test whether an exception is raised.

setup

How to define a setup method which gets called before every test.

teardown

How to define a cleanup method which gets called after every test.
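The rows above can be sketched together with the standard unittest module. The class name StackTest and its test names are hypothetical; assertRegex assumes Python 3.2 or later (Python 2.7 spells it assertRegexpMatches):

```python
import unittest


class StackTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test method.
        self.stack = [1, 2]

    def tearDown(self):
        # Runs after every test method.
        del self.stack

    def test_pop(self):
        # Equality assertion.
        self.assertEqual(self.stack.pop(), 2)

    def test_greeting(self):
        # Regex assertion.
        self.assertRegex("hello world", r"^hello")

    def test_pop_empty(self):
        # Exception assertion.
        with self.assertRaises(IndexError):
            [].pop()


# Run every test in the class and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(StackTest)
result = unittest.TestResult()
suite.run(result)
```

From the command line, python -m unittest followed by a dotted name such as module.StackTest or module.StackTest.test_pop runs the whole class or a single test, where module stands for the file's module name.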

Debugging and Profiling Footnotes

check syntax

How to check the syntax of code without executing it.

flags for stronger and strongest warnings

Flags to increase the warnings issued by the interpreter.

python:

The -t flag warns about inconsistent use of tabs in the source code. The -3 flag is a Python 2.X option which warns about syntax which is no longer valid in Python 3.X.

lint

A lint tool.

run debugger

How to run a script under the debugger.

debugger commands

A selection of commands available when running the debugger. The gdb commands are provided for comparison.

cmd                | perl -d         | python -m pdb              | rdebug                          | gdb
help               | h               | h                          | h                               | h
list               | l [first, last] | l [first, last]            | l [first, last]                 | l [first, last]
next statement     | n               | n                          | n                               | n
step into function | s               | s                          | s                               | s
set breakpoint     | b               | b [file:]line; b function  | b [file:]line; b class[.method] | b [file:]line
list breakpoints   | L               | b                          | info b                          | i b
delete breakpoint  | B num           | cl num                     | del num                         | d num
continue           | c               | c                          | c                               | c
show backtrace     | T               | w                          | w                               | bt
move up stack      |                 | u                          | u                               | u
move down stack    |                 | d                          | down                            | do
print expression   | p expr          | p expr                     | p expr                          | p expr
(re)run            | R               | restart [arg1[, arg2 …]]   | restart [arg1[, arg2 …]]        | r [arg1[, arg2 …]]
quit debugger      | q               | q                          | q                               | q

benchmark code

How to run a snippet of code repeatedly and get the user, system, and total wall clock time.

profile code

How to run the interpreter on a script and get the number of calls and total execution time for each function or method.

Java Interoperation Footnotes

Both Python and Ruby have JVM implementations. It is possible to compile both Python code and Ruby code to Java bytecode and run it on the JVM. It is also possible to run a version of the Python interpreter or the Ruby interpreter on the JVM which reads Python code or Ruby code, respectively.

version

Version of the scripting language JVM implementation used in this reference sheet.

repl

Command line name of the repl.

interpreter

Command line name of the interpreter.

compiler

Command line name of the tool which compiles source to java byte code.

prologue

Code necessary to make java code accessible.

new

How to create a java object.

method

How to invoke a java method.

import

How to import names into the current namespace.

import non-bundled java library

How to import a non-bundled Java library.

shadowing avoidance

How to import Java names which are the same as native names.

convert native array to java array

How to convert a native array to a Java array.

are java classes subclassable?

Can a Java class be subclassed?

are java classes open?

Can a Java class be monkey patched?

History

History of Scripting Languages

Scripting the Operating System

Every program is a "script": a set of instructions for the computer to follow. But early in the evolution of computers a need arose for scripts composed not just of machine instructions but of other programs.

IBM introduced Job Control Language (JCL) with System 360 in 1964. Apparently before JCL IBM machines were run by operators who fed programs through the machine one at a time. JCL provided the ability to run a sequence of jobs as specified on punch cards without manual intervention. The language was rudimentary, not having loops or variable assignment, though it did have parametrized procedures. In the body of a procedure a parameter was preceded by an ampersand: &. The language had conditional logic for taking actions depending upon the return code of a previously executed program. The return code was an integer and zero was used to indicate success.

Also in 1964 Louis Pouzin wrote a program called RUNCOM for the CTSS operating system which could run scripts of CTSS commands. Pouzin thought that shells or command line interpreters should be designed with scriptability in mind and he wrote a paper to that effect.

Unix

The first Unix shell was the one Ken Thompson wrote in 1971. It was scriptable in that it supported if and goto as external commands. It did not have assignment or variables.

In the late 1970s the Unix shell scripting landscape came into place. The Bourne shell replaced the Thompson shell in 7th Edition Unix which shipped in 1979. The Bourne shell dispensed with the goto and instead provided an internally implemented if statement and while and for loops. The Bourne shell had user defined variables which used a dollar sign sigil ($) for access but not assignment.

The C-shell also made its appearance in 1979 with the 2nd Berkeley standard distribution of Unix. It was so named because its control structures resembled the control structures of C. The C-shell eventually acquired a bad reputation as a programming environment. Its true contribution was the introduction of job control and command history. Later shells such as the Korn Shell (1982) and the Bourne Again Shell would attempt to incorporate these features in a manner backward compatible with the Bourne shell.

Another landmark in Unix shell scripting was awk which appeared in 1977. awk is a specialized language in that there is an implicit loop and the commands are by default executed on every line of input. However, this was a common pattern in the text file oriented environment of Unix.

more IBM developments

The PC made its appearance in 1981. It came with a command interpreter called COMMAND.COM which could run a batch file. PC-DOS for that matter was patterned closely on CP/M, the reigning operating system of home computers at the time, which itself borrowed from various DEC operating systems such as TOPS-10. I'm not certain whether CP/M or even TOPS-10 for that matter had batch files. As a programming environment COMMAND.COM was inferior to the Unix shells. Modern Windows systems make this programming environment available with CMD.EXE.

IBM released a scripting language called Rexx for its mainframe operating systems in 1982. Rexx was superior as a programming environment to the Unix shells of the time, and in fact Unix didn't have anything comparable until the appearance of Perl and Tcl in the late 1980s. IBM also released versions of Rexx for OS/2 and PC-DOS.

Scripting the Web

The Original HTTP as defined in 1991
HTML Specification Draft June 1993
WorldWideWeb Browser
Mosaic Web Browser

Tim Berners-Lee created the web in 1990. It ran on a NeXT cube. The browser and the web server communicated via a protocol invented for the purpose called HTTP. The documents were marked up in a type of SGML called HTML. The key innovation was the hyperlink. If the user clicked on a hyperlink, the browser would load the document pointed to by the link. A hyperlink could also take the user to a different section of the current document.

The initial version of HTML included these tags:

html, head, title, body, h1, h2, h3, h4, h5, h6, pre, blockquote, b, i, a, img, ul, ol, li, dl, dt, dd

The browser developed by Berners-Lee was called WorldWideWeb. It was graphical, but it wasn't widely used because it only ran on NeXT. Nicola Pellow wrote a text-only browser and ported it to a variety of platforms in 1991. Mosaic was developed by Andreessen and others at NCSA and released in February 1993. Mosaic was the first browser which could display images in-line with text. It was originally released for X Windows, and it was ported to Macintosh a few months later. Ports for the Amiga and Windows were available in October and December of 1993.

CGI and Forms

RFC 3875: CGI Version 1.1 2004
HTML 2.0 1995
NSAPI Programmer's Guide (pdf) 2000
Apache HTTP Server Project
History of mod_perl
FastCGI Specification 1996

The original web permitted a user to edit a document with a browser, provided he or she had permission to do so. But otherwise the web was static. The group at NCSA developed forms so users could submit data to a web server. They developed the CGI protocol so the server could invoke a separate executable and pass form data to it. The separate executable, referred to as a CGI script in the RFC, could be implemented in almost any language. Perl was a popular early choice. What the CGI script writes to standard out becomes the HTTP response. Usually this would contain a dynamically generated HTML document.

HTML 2.0 introduced the following tags to support forms:

form input select option textarea

The input tag has a type attribute which can be one of the following:

text password checkbox radio image hidden submit reset

If the browser submits the form data with a GET, the form data is included in the URL after a question mark (?). The form data consists of key value pairs. Each key is separated from its value by an equals (=), and the pairs are separated from each other by ampersands (&). The CGI protocol introduces an encoding scheme for escaping the preceding characters in the form data or any other characters that are meaningful or prohibited in URLs. Typically, the web server will set a QUERY_STRING environment variable to pass the GET form data to the CGI script. If the browser submits the data with POST, the form data is encoded in the same manner as for GET, but the data is placed in the HTTP request body. The media type is set to application/x-www-form-urlencoded.
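The encoding described above can be sketched with urllib.parse from the Python 3 standard library. The field names and values are made up for illustration:

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields; the second value contains & and =,
# which must be escaped so they are not mistaken for separators.
fields = [("name", "Ada Lovelace"), ("q", "a&b=c")]

# What a browser would place after the ? in a GET request,
# or in the request body of a POST.
query_string = urlencode(fields)

# What a server-side script would recover, for example from QUERY_STRING.
parsed = parse_qs(query_string)
```

parse_qs returns a list for each key because a form may submit the same key more than once.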

Andreessen and others at NCSA joined the newly founded company Netscape, which released a browser in 1994. Netscape also released a web server with a plug-in architecture. The architecture was an attempt to address the fact that handling web requests with CGI scripts was slow: a separate process was created for each request. With the Netscape web server, the equivalent of a CGI script would be written in C and linked in to the server. The C API that the developer used was called NSAPI. Microsoft developed a similar API called ISAPI for the IIS web server.

The NCSA web server had no such plug-in architecture, but it remained the most popular web server in 1995 even though development had come to a halt. The Apache web server project started up that year; it used the NCSA httpd 1.3 code as a starting point and it was the most popular web server within a year. Apache introduced the Apache API, which permitted C style web development in the manner of NSAPI and ISAPI. The Apache extension mod_perl, released in March 1996, was a client of the Apache API. By means of mod_perl an Apache web server could handle a CGI request in memory using an embedded perl interpreter instead of forking off a separate perl process.

Ousterhout on Scripting Languages

Ousterhout wrote an article for IEEE Computer in 1998 which drew a distinction between system programming languages and scripting languages. As examples of scripting languages Ousterhout cited Perl, Python, Rexx, Tcl, Visual Basic, and the Unix shells. To Ousterhout the biggest difference between the two classes of language is that system programming languages are strongly typed whereas scripting languages are typeless. Being typeless was in Ousterhout's mind a necessary trait for a scripting language to serve as "glue language" to connect the components of an application written in other languages. Ousterhout also noted that system programming languages are usually compiled whereas scripting languages are usually interpreted, and he predicted that the relative use of scripting languages would rise.

Later Web Developments

HTML Templates

PHP/FI Version 2.0
PHP Usage

Web development with CGI scripts written in Perl was easier than writing web server plug-ins in C. The task of writing Perl CGI scripts was made easier by libraries such as cgi-lib.pl and CGI.pm. These libraries made the query parameters available in a uniform fashion regardless of whether a GET or POST request was being handled and also took care of assembling the headers in the response. Still, CGI scripts tended to be difficult to maintain because of the piecemeal manner in which the response document is assembled.

Rasmus Lerdorf adopted a template approach for maintaining his personal home page. The document to be served up was mostly static HTML with an escaping mechanism for inserting snippets of code. In version 2.0 the escapes were <? code > and <?echo code >. Lerdorf released the code for the original version, called PHP/FI and implemented in Perl, in 1995. The original version was re-implemented in C and version 2.0 was released in 1997. For version 3.0, released in 1998, the name was simplified to PHP. Versions 4.0 and 5.0 were released in 2000 and 2004. PHP greatly increased in popularity with the release of version 4.0. Forum software, blogging software, wikis, and other content management systems (CMS) are often implemented in PHP.

Microsoft added a template engine called Active Server Pages (ASP) for IIS in 1996. ASP uses <% code %> and <%= code %> for escapes; the code inside the script could be any number of languages but was usually a dialect of Visual Basic called VBScript. Java Server Pages (JSP), introduced by Sun in 1999, uses the same escapes to embed Java.

MVC Frameworks

The template approach to web development has limitations. Consider the case where the web designer wants to present a certain page if the user is logged in, and a completely unrelated page if the user is not logged in. If the request is routed to an HTML template, then the template will likely have to contain a branch and two mostly unrelated HTML templates. The page that is presented when the user is not logged in might also be displayed under other circumstances, and unless some code sharing mechanism is devised, there will be duplicate code and the maintenance problem that entails.

The solution is for the request to initially be handled by a controller. Based upon the circumstances of the request, the controller chooses the correct HTML template, or view, to present to the user.

Websites frequently retrieve data from and persist data to a database. In a simple PHP website, the SQL might be placed directly in the HTML template. However, this results in a file which mixes three languages: SQL, HTML, and PHP. It is cleaner to put all database access into a separate file or model, and this also promotes code reuse.

The Model-View-Controller design pattern was conceived in 1978. It was used in Smalltalk for GUI design. It was perhaps in Java that the MVC pattern was introduced to web development.

Early versions of Java were more likely to be run in the browser as an applet than in the server. Sun finalized the Servlet API in June 1997. Servlets handled requests and returned responses, and thus were the equivalent of controllers in the MVC pattern. Sun worked on a reference web server which used servlets. This code was donated to the Apache foundation, which used it in the Tomcat webserver, released in 1999. The same year Sun introduced JSP, which corresponds to the view of the MVC pattern.

The Struts MVC framework was introduced in 2000. The Spring MVC framework was introduced in 2002; some prefer it to Struts because it doesn't use Enterprise JavaBeans. Hibernate, introduced in 2002, is an ORM and can serve as the model of an MVC framework.

Ruby on Rails was released in 2004. Ruby has a couple of advantages over Java when implementing an MVC framework. The models can inspect the database and create accessor methods for each column in the underlying table on the fly. Ruby is more concise than Java and has better string manipulation features, so it is a better language to use in HTML templates. Other dynamic languages have built MVC frameworks, e.g. Django for Python.

Python

2.7: Language, Standard Library
Why Python3 Summary of Backwardly Non-compatible Changes in Python 3
3.2: Language, Standard Library
PEP 8: Style Guide for Python Code van Rossum

Python uses leading whitespace to indicate block structure. It is not recommended to mix tabs and spaces in leading whitespace, but when this is done, a tab is equal to 8 spaces. The command line options '-t' and '-tt' will warn and raise an error respectively when tabs are used inconsistently for indentation.

Regular expressions and functions for interacting with the operating system are not available by default and must be imported to be used, i.e.

 
import re, sys, os

Identifiers in imported modules must be fully qualified unless imported with from/import:

 
from sys import path
from re import *

There are two basic sequence types: the mutable list and the immutable tuple. The literal syntax for lists uses square brackets and commas [1,2,3] and the literal syntax for tuples uses parens and commas (1,2,3).
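A brief sketch of the difference in mutability; the variable names are arbitrary:

```python
numbers = [1, 2, 3]
numbers[0] = 10          # lists can be modified in place
numbers.append(4)

point = (1, 2, 3)
try:
    point[0] = 10        # tuples cannot
    tuple_changed = True
except TypeError:
    tuple_changed = False
```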

The dictionary data type literal syntax uses curly brackets, colons, and commas { "hello":5, "goodbye":7 }. Python 3 adds a literal syntax for sets which uses curly brackets and commas: {1,2,3}. This notation is also available in Python 2.7. Dictionaries and sets are implemented using hash tables and as a result dictionary keys and set elements must be hashable.
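The hashability requirement in practice, as a short sketch with arbitrary sample values:

```python
counts = {"hello": 5, "goodbye": 7}
primes = {2, 3, 5}

# A tuple of hashable items is hashable, so it can serve as a key.
counts[(1, 2)] = 0

# A list is mutable and unhashable, so it cannot.
try:
    counts[[1, 2]] = 0
    list_key_ok = True
except TypeError:
    list_key_ok = False

# The same rule applies to set elements.
try:
    primes.add([7, 11])
    list_element_ok = True
except TypeError:
    list_element_ok = False
```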

All values that can be stored in a variable and passed to functions as arguments are objects in the sense that they have methods which can be invoked using the method syntax.

Attributes are settable by default. This can be changed by defining a __setattr__ method for the class. The attributes of an object are stored in the __dict__ attribute. Methods must declare the receiver as the first argument.
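A minimal sketch of these three points; the class name Frozen is hypothetical:

```python
class Frozen(object):
    """Hypothetical class whose attributes cannot be set after creation."""

    def __init__(self, x):
        # The receiver 'self' is declared explicitly as the first argument.
        # Writing into __dict__ directly bypasses __setattr__ below.
        self.__dict__["x"] = x

    def __setattr__(self, name, value):
        raise AttributeError("attributes of Frozen are read-only")


f = Frozen(3)

try:
    f.x = 4
    assignment_ok = True
except AttributeError:
    assignment_ok = False
```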

Classes, methods, functions, and modules are objects. If the body of a class, method, or function definition starts with a string, it is available at runtime via __doc__. Code examples in the string which are preceded with '>>>' (the python repl prompt) can be executed by doctest and compared with the output that follows.
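A sketch of a docstring containing one doctest example, run programmatically. The function add is hypothetical; the more common route is doctest.testmod() or python -m doctest on a file:

```python
import doctest


def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    """
    return a + b


# Extract the examples from the docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(add.__doc__, {"add": add}, "add", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
```

results is a named tuple holding the number of failed and attempted examples.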

Java History

Java Version History

Java was developed by James Gosling at Sun and made publicly available in 1996. It is a compiled, object-oriented language with syntax similar to C++. It is perhaps best understood by how it differs from C++:

Compared to C++, the language is easier to use and easier to port. Applets that ran in the browser were an early use of the language that helped popularize it.

Version 1.1 (1997) added RMI and several types of nested classes including anonymous classes. Version 1.2 (1998) added the ability to reflect on the methods of a class or object at runtime. Version 1.4 (2002) added Perl style regular expressions. Version 1.5 (2004) added generics, which are roughly similar to C++ templates, and autoboxing, in which the compiler automatically wraps a primitive type with an instance of a wrapper class when needed.

Over the years Java has developed an extensive standard library; as of version 1.5 the standard library contains 3000 classes. Third parties are encouraged to use their internet domain name to determine the location of their code in the Java code namespace, a technique which makes it easy to integrate code from non-standard sources.

Other languages have targeted the JVM: Jython since 1997, Scala and Groovy since 2003, and Clojure since 2007. JVM languages include interpreters written in Java and languages which can be compiled to bytecode.