Clojure

Technical Documentation

OVERVIEW

Clojure is a dynamic, general-purpose programming language, combining the approachability and interactive development of a scripting language with an efficient and robust infrastructure for multithreaded programming. Clojure is a compiled language, yet remains completely dynamic – every feature supported by Clojure is supported at runtime. Clojure provides easy access to the Java frameworks, with optional type hints and type inference, to ensure that calls to Java can avoid reflection.
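
As a rough illustration of the type-hint point, the sketch below compares an unhinted Java call with a hinted one. The function names len-reflective and len-hinted are invented for this example and are not part of Clojure.

```clojure
;; Ask the compiler to print a warning whenever a Java call
;; has to be resolved by reflection.
(set! *warn-on-reflection* true)

;; No hint: the class of s is unknown at compile time, so .length
;; is looked up reflectively at runtime (a warning is printed).
(defn len-reflective [s]
  (.length s))

;; ^String is an optional type hint; with it the call compiles to a
;; direct invocation of String.length().
(defn len-hinted [^String s]
  (.length s))

(len-reflective "Clojure")   ; -> 7
(len-hinted "Clojure")       ; -> 7
```

Both functions return the same value; the hint only changes how the call is compiled.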

Clojure is a dialect of Lisp, and shares with Lisp the code-as-data philosophy and a powerful macro system. Clojure is predominantly a functional programming language, and features a rich set of immutable, persistent data structures. When mutable state is needed, Clojure offers a software transactional memory system and reactive Agent system that ensure clean, correct, multithreaded designs.

I hope you find Clojure's combination of facilities elegant, powerful, practical and fun to use.

Rich Hickey (author of Clojure and CTO of Cognitect)

Rationale

Customers and stakeholders have substantial investments in, and are comfortable with the performance, security and stability of, industry-standard platforms like the JVM. While Java developers may envy the succinctness, flexibility and productivity of dynamic languages, they have concerns about running on customer-approved infrastructure, access to their existing code base and libraries, and performance. In addition, they face ongoing problems dealing with concurrency using native threads and locking. Clojure is an effort in pragmatic dynamic language design in this context. It endeavors to be a general-purpose language suitable in those areas where Java is suitable. It reflects the reality that, for the concurrent programming future, pervasive, unmoderated mutation simply has to go.

Clojure meets its goals by: embracing an industry-standard, open platform - the JVM; modernizing a venerable language - Lisp; fostering functional programming with immutable persistent data structures; and providing built-in concurrency support via software transactional memory and asynchronous agents. The result is robust, practical, and fast.

Clojure has a distinctive approach to state and identity.

Why Clojure?

Why did I write yet another programming language? Basically because I wanted:

  • A Lisp
  • for Functional Programming
  • symbiotic with an established Platform
  • designed for Concurrency
and couldn’t find one. Here’s an outline of some of the motivating ideas behind Clojure.

Lisp is a good thing

  • Often emulated/pillaged, still not duplicated
  • Lambda calculus yields an extremely small core
  • Almost no syntax
  • Core advantage still code-as-data and syntactic abstraction
  • What about the standard Lisps (Common Lisp and Scheme)?
    1. Slow/no innovation post standardization
    2. Core data structures mutable, not extensible
    3. No concurrency in specs
    4. Good implementations already exist for JVM (ABCL, Kawa, SISC et al)
    5. Standard Lisps are their own platforms
  • Clojure is a Lisp not constrained by backwards compatibility
    1. Extends the code-as-data paradigm to maps and vectors
    2. Defaults to immutability
    3. Core data structures are extensible abstractions
    4. Embraces a platform (JVM)
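
A minimal sketch of the code-as-data and syntactic-abstraction points above. The names form and unless are hypothetical, chosen only for this illustration.

```clojure
;; A quoted form is ordinary data: a list holding a symbol and two numbers.
(def form '(+ 1 2))

(list? form)     ; -> true
(first form)     ; -> +
(eval form)      ; -> 3, the data is run as code

;; Vectors are part of the code-as-data story too: the parameter
;; list of a function definition is a vector.
(vector? (nth '(defn square [x] (* x x)) 2))   ; -> true

;; A macro is a function from code to code, run at compile time.
;; This one swaps the branches of an if.
(defmacro unless [test then else]
  `(if ~test ~else ~then))

(unless false :ran :skipped)                    ; -> :ran
(macroexpand-1 '(unless false :ran :skipped))   ; -> (if false :skipped :ran)
```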

Functional programming is a good thing

  • Immutable data + first-class functions
  • Could always be done in Lisp, by discipline/convention
    1. But if a data structure can be mutated, dangerous to presume it won’t be
    2. In traditional Lisp, only the list data structure is structurally recursive
  • Pure functional languages tend to be strongly statically typed (not for everyone, or every task)
  • Clojure is a functional language with a dynamic emphasis
    1. All data structures immutable & persistent, supporting recursion
    2. Heterogeneous collections, return types
    3. Dynamic polymorphism

Clojure is a functional programming language. It provides the tools to avoid mutable state, provides functions as first-class objects, and emphasizes recursive iteration instead of side-effect based looping. Clojure is impure, in that it doesn’t force your program to be referentially transparent, and doesn’t strive for 'provable' programs. The philosophy behind Clojure is that most parts of most programs should be functional, and that programs that are more functional are more robust.
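
One way to picture recursive iteration in place of side-effect based looping is the sketch below. The helper name sum-to is an assumption made for this example.

```clojure
;; Adds the integers 1..n without mutating a counter. recur rebinds
;; i and acc to fresh values on each pass and reuses the stack frame,
;; so long iterations do not grow the stack.
(defn sum-to [n]
  (loop [i 1
         acc 0]
    (if (> i n)
      acc
      (recur (inc i) (+ acc i)))))

(sum-to 100)                ; -> 5050

;; The same result expressed with a higher-order function.
(reduce + (range 1 101))    ; -> 5050
```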

First-class functions

fn creates a function object. It yields a value like any other - you can store it in a var, pass it to functions etc.

(def hello (fn [] "Hello world"))
-> #'user/hello
(hello)
-> "Hello world"
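
To illustrate the "pass it to functions etc." part, here is a short sketch. The names ops and inc-then-str are made up for this example.

```clojure
;; Functions are values: pass them as arguments...
(map (fn [x] (* x x)) [1 2 3])     ; -> (1 4 9)
(filter even? [1 2 3 4])           ; -> (2 4)

;; ...store them in collections...
(def ops {:double (fn [x] (* 2 x))
          :negate -})
((:double ops) 21)                 ; -> 42
((:negate ops) 5)                  ; -> -5

;; ...and build new functions out of existing ones.
(def inc-then-str (comp str inc))
(inc-then-str 41)                  ; -> "42"
```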

defn is a macro that makes defining functions a little simpler. Clojure supports arity overloading in a single function object, self-reference, and variable-arity functions using &:

;; trumped-up example
(defn argcount
  ([] 0)
  ([x] 1)
  ([x y] 2)
  ([x y & more] (+ (argcount x y) (count more))))

-> #'user/argcount

(argcount)
-> 0
(argcount 1)
-> 1
(argcount 1 2)
-> 2
(argcount 1 2 3 4 5)
-> 5

You can create local names for values inside a function using let. The scope of any local names is lexical, so a function created in the scope of local names will close over their values:

(defn make-adder [x]
  (let [y x]
    (fn [z] (+ y z))))
(def add2 (make-adder 2))
(add2 4)
-> 6

Locals created with let are not variables. Once created their values never change!
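
A small sketch of what "not variables" means in practice: an inner let can shadow a name, but it cannot change what an existing closure sees. The names result and get-x are hypothetical.

```clojure
(def result
  (let [x 1
        get-x (fn [] x)]      ; closes over the outer x
    (let [x 2]                ; a new binding that shadows x here only
      [x (get-x)])))

result    ; -> [2 1], the closure still sees the original value
```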

Immutable Data Structures

The easiest way to avoid mutating state is to use immutable data structures. Clojure provides a set of immutable lists, vectors, sets and maps. Since they can’t be changed, 'adding' or 'removing' something from an immutable collection means creating a new collection just like the old one but with the needed change. Persistence is a term used to describe the property wherein the old version of the collection is still available after the 'change', and that the collection maintains its performance guarantees for most operations. Specifically, this means that the new version can’t be created using a full copy, since that would require linear time. Inevitably, persistent collections are implemented using linked data structures, so that the new versions can share structure with the prior version. Singly-linked lists and trees are the basic functional data structures, to which Clojure adds a hash map, set and vector, all based upon array mapped hash tries.

The collections have readable representations and common interfaces:

(let [my-vector [1 2 3 4]
      my-map {:fred "ethel"}
      my-list (list 4 3 2 1)]
  (list
    (conj my-vector 5)
    (assoc my-map :ricky "lucy")
    (conj my-list 5)
    ;; the originals are intact
    my-vector
    my-map
    my-list))
-> ([1 2 3 4 5] {:ricky "lucy", :fred "ethel"} (5 4 3 2 1) [1 2 3 4] {:fred "ethel"} (4 3 2 1))
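
The structure sharing described earlier can be observed directly for lists. The sketch below uses illustrative names (my-list, longer, big, big2); the identity check relies on how the current JVM implementation of lists behaves, so treat it as a demonstration rather than a documented guarantee.

```clojure
(def my-list (list 4 3 2 1))
(def longer (conj my-list 5))

longer                                ; -> (5 4 3 2 1)
;; The tail of the new list is the very same object as the old
;; list, not a copy of it.
(identical? my-list (rest longer))    ; -> true

;; Vectors share structure internally as well. What can be seen
;; from outside is that the original is untouched after a 'change'.
(def big (vec (range 1000)))
(def big2 (assoc big 500 :changed))
(nth big 500)                         ; -> 500
(nth big2 500)                        ; -> :changed
(count big2)                          ; -> 1000
```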

Applications often need to associate attributes and other data about data that is orthogonal to the logical value of the data. Clojure provides direct support for this metadata. Symbols, and all of the collections, support a metadata map. It can be accessed with the meta function. Metadata does not impact equality semantics, nor will metadata be seen in operations on the value of a collection. Metadata can be read, and can be printed.

(def v [1 2 3])
(def attributed-v (with-meta v {:source :trusted}))
(:source (meta attributed-v))
-> :trusted
(= v attributed-v)
-> true

Languages and Platforms

VMs, not OSes, are the platforms of the future, providing:

  • Type system
    1. Dynamic enforcement and safety
  • Libraries
    1. Abstract away OSes
    2. Huge set of facilities
    3. Built-in and 3rd-party
  • Memory and other resource management
    1. GC is platform, not language, facility
  • Bytecode + JIT compilation
    1. Abstracts away hardware

Language as platform vs. language + platform

  • Old way - each language defines its own runtime
    1. GC, bytecode, type system, libraries etc
  • New way (JVM, .Net)
    1. Common runtime independent of language

Language built for platform vs language ported-to platform

  • Many new languages still take 'Language as platform' approach
  • When ported, have platform-on-platform issues
    1. Memory management, type-system, threading issues
    2. Library duplication
    3. If original language based on C, some extension libraries written in C don’t come over

Platforms are dictated by clients

  • 'Must run on JVM' or .Net vs 'must run on Unix' or Windows
  • JVM has established track record and trust level
    1. Now also open source
  • Interop with other code required
  • C linkage insufficient these days

Java/JVM is language + platform

  • Not the original story, but other languages for JVM always existed, now embraced by Sun
  • Java can be tedious, insufficiently expressive
    1. Lack of first-class functions, no type inference, etc
  • Ability to call/consume Java is critical

Clojure is the language, JVM the platform
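
A brief sketch of what calling and consuming Java looks like from Clojure. Only standard JDK classes are used; the var name names is an assumption for this example.

```clojure
;; Call an instance method on a java.lang.String.
(.toUpperCase "clojure")        ; -> "CLOJURE"

;; Call a static method.
(Math/max 3 7)                  ; -> 7

;; Construct a Java object and use it. The mutable ArrayList is
;; copied into an immutable Clojure vector at the end.
(def names
  (let [al (java.util.ArrayList.)]
    (.add al "ethel")
    (.add al "fred")
    (vec al)))

names                           ; -> ["ethel" "fred"]
```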

Object Orientation is overrated

  • Born of simulation, now used for everything, even when inappropriate
    1. Encouraged by Java/C# in all situations, due to their lack of (idiomatic) support for anything else
  • Mutable stateful objects are the new spaghetti code
    1. Hard to understand, test, reason about
    2. Concurrency disaster
  • Inheritance is not the only way to do polymorphism
  • "It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures." - Alan J. Perlis
  • Clojure models its data structures as immutable objects represented by interfaces, and otherwise does not offer its own class system.
  • Many functions defined on few primary data structures (seq, map, vector, set).
  • Write Java in Java, consume and extend Java from Clojure.
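
The last two points can be sketched as follows: the same few functions work across the primary data structures, and a Java interface can be implemented from Clojure without writing a class. The name by-length is hypothetical.

```clojure
;; The same sequence functions work on every collection type.
(map inc [1 2 3])            ; vector -> (2 3 4)
(map inc (list 1 2 3))       ; list   -> (2 3 4)
(reduce + #{1 2 3})          ; set    -> 6
(map key {:a 1})             ; map    -> (:a)
(seq "abc")                  ; string -> (\a \b \c)

;; Extending Java from Clojure: reify implements a Java interface
;; (here java.util.Comparator) as an anonymous object.
(def by-length
  (reify java.util.Comparator
    (compare [_ a b]
      (Integer/compare (count a) (count b)))))

(sort by-length ["ccc" "a" "bb"])    ; -> ("a" "bb" "ccc")
```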

Polymorphism is a good thing

  • Switch statements, structural matching etc yield brittle systems
  • Polymorphism yields extensible, flexible systems
  • Clojure multimethods decouple polymorphism from OO and types
    1. Supports multiple taxonomies
    2. Dispatches via static, dynamic or external properties, metadata, etc
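
A minimal multimethod sketch, assuming shapes are represented as plain maps with a :shape key. The name area, the map layout and the ::square / ::rect-like keywords are all invented for this example.

```clojure
;; Dispatch on the value of the :shape key, not on a class.
(defmulti area :shape)

(defmethod area :circle [{:keys [radius]}]
  (* Math/PI radius radius))

(defmethod area :rect [{:keys [width height]}]
  (* width height))

;; Fallback for values with no matching method.
(defmethod area :default [shape]
  (throw (IllegalArgumentException.
           (str "Unknown shape: " (:shape shape)))))

(area {:shape :rect :width 3 :height 4})    ; -> 12
(area {:shape :circle :radius 1})           ; roughly 3.14159

;; Multiple taxonomies: an ad hoc hierarchy of keywords,
;; independent of any class hierarchy.
(derive ::square ::rect-like)
(isa? ::square ::rect-like)                 ; -> true
```

New shapes can be supported by adding another defmethod, without touching the existing ones.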

Concurrency and the multi-core future

  • Immutability makes much of the problem go away
    1. Share freely between threads
  • But changing state a reality for simulations and for in-program proxies to the outside world
  • Locking is too hard to get right over and over again
  • Clojure’s software transactional memory and agent systems do the hard part
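
As a rough sketch of the two facilities named above, the example below moves a value between two refs inside a transaction and then updates an agent asynchronously. The names checking, savings, transfer! and counter are hypothetical.

```clojure
;; Two refs that must change together.
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer! [from to amount]
  ;; dosync runs the body as a transaction: both alters commit
  ;; together or not at all, and no locks appear in user code.
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)
[@checking @savings]          ; -> [70 30]

;; An agent holds state that is updated asynchronously: send queues
;; the function and returns immediately.
(def counter (agent 0))
(send counter inc)
(send counter + 10)
(await counter)               ; block until the queued actions have run
@counter                      ; -> 11
```

In a standalone script, calling (shutdown-agents) before exit lets the JVM terminate promptly, since the agent thread pool otherwise stays alive.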

In short, I think Clojure occupies a unique niche as a functional Lisp for the JVM with strong concurrency support. Check out some of the features or get started with Clojure.