The Go Programming Language


Sydney University March 23, 2010

Go

New

Experimental

Concurrent

Garbage Collected

Systems Language

Hello, world

package main

import "fmt"

func main() {
	fmt.Printf("Hello, 世界\n")
}

Hello, world 2.0

Serving http://localhost:8080/world

package main

import (
	"fmt"
	"net/http"
)

// handler greets whatever follows the leading "/" in the request path.
func handler(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Hello, %s.", r.URL.Path[1:])
}

func main() {
	http.ListenAndServe(":8080", http.HandlerFunc(handler))
}

New

It's about two years old:

  • Design started in late 2007
  • Implementation starting to work mid-2008
  • Released as an open source project in November 2009
  • Development continues with an active community

Why invent a new language? Older languages weren't designed for concurrency, but modern software needs it:

  • Large scale, networked computing, such as Google web search
  • Multi-core hardware

New

Older languages are also frustrating on a day-to-day basis

Statically-typed languages (C, C++, Java) have issues:

  • Edit-Compile-Run cycle takes far too long
  • Type hierarchy can hurt as much as it helps

Dynamic languages (Python, JavaScript) fix some issues but introduce others:

  • No compilation means slow code
  • Runtime errors that should be caught statically

Go has the lighter feel of a scripting language but is compiled

New

Large C++ programs (e.g. Firefox, OpenOffice, Chromium) have enormous build times:

On a Mac (OS X 10.5.8, gcc 4.0.1):

  • C: #include <stdio.h> reads 360 lines from 9 files
  • C++: #include <iostream> reads 25,326 lines from 131 files
  • Objective-C: #include <Carbon/Carbon.h> reads 124,730 lines from 689 files
  • We haven't done any real work yet!

In Go: import "fmt" reads one file: 184 lines summarizing 7 packages

New

Compilation demo

Experimental

Go is still unproven

Language is still evolving

Package library is incomplete

Concurrent garbage collection is an active research problem

Reviving forgotten concepts:

  • Go's concurrency is strongly influenced by Communicating Sequential Processes (Hoare, 1978)
  • Go has types and interfaces, but no inheritance. It is arguably more object-oriented than previously mentioned languages, being closer to the original Smalltalk meaning (1970s)

Concurrent

Unix philosophy: write programs that do one thing and do it well

Connect them with pipes:

  • How many lines of test code are there in the Go standard library?
  • find ~/go/src/pkg | grep _test.go$ | xargs wc -l

Unlike other languages, Go makes it easy to:

  • Launch goroutines
  • Connect them with channels

Concurrent

Start a new flow of control with the go keyword

Parallel computation is easy:

func main() {
	go expensiveComputation(x, y, z)
	anotherExpensiveComputation(a, b, c)
}

Roughly speaking, a goroutine is like a thread, but lighter weight:

  • Goroutines have segmented stacks, and typically smaller stacks
  • This requires compiler support. Goroutines can't just be a C++ library on top of a thread library

Concurrent

Consider web servers ("the C10k problem"):

  • "Thread per connection" approach is conceptually neat, but doesn't scale well in practice
  • The approaches that do scale well (event-driven callbacks, asynchronous APIs) are harder to understand, maintain, and debug
  • We think "goroutine per connection" can scale well, and is conceptually neat

for {
	rw := socket.Accept()
	conn := newConn(rw, handler)
	go conn.serve()
}

Concurrent

Let's look again at our simple parallel computation:

func main() {
	go expensiveComputation(x, y, z)
	anotherExpensiveComputation(a, b, c)
}

This story is incomplete:

  • How do we know when the two computations are done?
  • What are their values?

Concurrent

Goroutines communicate with other goroutines via channels

func computeAndSend(ch chan int, x, y, z int) {
	ch <- expensiveComputation(x, y, z)
}

func main() {
	ch := make(chan int)
	go computeAndSend(ch, x, y, z)
	v2 := anotherExpensiveComputation(a, b, c)
	v1 := <-ch
	fmt.Println(v1, v2)
}

Concurrent

In traditional concurrent programs, you communicate by sharing memory. In Go, you share memory by communicating:

  • Communication (the <- operator) is sharing and synchronization

Threads and locks are concurrency primitives; CSP is a concurrency model:

  • Analogy: "Go To Statement Considered Harmful" (Dijkstra, 1968)
  • goto is a control flow primitive; structured programming (if statements, for loops, function calls) is a control flow model

Learning CSP changes the way you think about concurrent programming:

  • Every language has its grain. If your Go program uses mutexes, you're probably working against the grain

Garbage Collected

Automatic memory management makes writing (and maintaining) programs easier

Especially in a concurrent world:

  • Who "owns" a shared piece of memory, and is responsible for destroying it?

Large C++ programs usually end up with semi-automatic memory management anyway, via "smart pointers"

Mixing the two models can be problematic:

  • Browsers can leak memory easily; DOM elements are C++ objects, but JavaScript is garbage collected

Garbage Collected

Go is also a safer language:

  • Pointers but no pointer arithmetic
  • No dangling pointers
  • Variables are zero-initialized
  • Array access is bounds-checked

No buffer overflow exploits

Systems Language

This just means you could write decently large programs in Go:

  • Web servers
  • Web browsers
  • Web crawlers
  • Search indexers
  • Databases
  • Word processors
  • Integrated Development Environments (IDEs)
  • Operating systems
  • ...

Systems Language

Garbage collection has a reputation for being "slower"

We're expecting Go to be slightly slower than optimized C, but faster than Java, depending on the task. Nonetheless:

  • Fast and buggy is worse than almost-as-fast and correct
  • It is easier to optimize a correct program than to correct an optimized program
  • Fundamentally, it's simply a trade-off we're willing to make

Memory layout can drastically affect performance. These two designs have the same memory layout in Go, but differ significantly in Java, where each nested Point would be a separate heap object:

type Point struct { X, Y int }
type Rect struct { P0, P1 Point }

// or ...

type Rect struct { X0, Y0, X1, Y1 int }

Systems Language

Quote from http://loadcode.blogspot.com/2009/12/go-vs-java.html

"[Git] is known to be very fast. It is written in C. A Java version JGit was made. It was considerably slower. Handling of memory and lack of unsigned types was some of the important reasons.

Shawn O. Pearce wrote on the git mailinglist:

  • "JGit struggles with not having an efficient way to represent a SHA-1. C can just say "unsigned char[20]" and have it inline into the container's memory allocation. A byte[20] in Java will cost an *additional* 16 bytes of memory, and be slower to access because the bytes themselves are in a different area of memory from the container object. We try to work around it by converting from a byte[20] to 5 ints, but that costs us machine instructions"

Like C, Go does allow unsigned types and defining data structures containing other data structures as continuous blocks of memory."

Go

New

Experimental

Concurrent

Garbage Collected

Systems Language

And more:

  • I haven't talked about the type system, interfaces, slices, closures, selects, ...
  • Documentation, mailing list, source code all online

Questions?