Providence Salumu
http://www.haskell.org/tutorial/
).Create a file called hello.hs
with the following contents:
main = putStrLn "Hello, world!"
Compile your program to a native executable like this:
$ ghc --make hello
[1 of 1] Compiling Main ( hello.hs, hello.o )
Linking hello ...
$ ./hello
Hello, world!
Or run it in the GHCI interpreter like this:
$ ghci hello.hs
GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help
...
Ok, modules loaded: Main.
*Main> main
Hello, world!
*Main>
Haskell uses the =
sign to declare bindings:
x = 2 -- Two hyphens introduce a comment
y = 3 -- ...that continues to end of line.
main = let z = x + y -- let introduces local bindings
in print z -- program will print 5
;
", which is usually auto-inserted by a layout ruleadd arg1 arg2 = arg1 + arg2 -- defines function add
five = add 2 3 -- invokes function add
Parentheses can wrap compound expressions, must do so for arguments
bad = print add 2 3 -- error! (print should have only 1 argument)
main = print (add 2 3) -- ok, calls print with 1 argument, 5
x = 5
x = 6 -- error, cannot re-bind x
safeDiv x y =
let q = div x y -- safe as q never evaluated if y == 0
in if y == 0 then 0 else q
main = print (safeDiv 1 0) -- prints 0
x = 5 -- this x is not used in main
main = let x = x + 1 -- introduces new x, defined in terms of itself
in print x -- program "diverges" (i.e., loops forever)
In C, we use mutable variables to create loops:
long factorial (int n)
{
long result = 1;
while (n > 1)
result *= n--;
return result;
}
In Haskell, can use recursion to "re-bind" argument symbols in new scope
factorial n = if n > 1
then n * factorial (n-1)
else 1
This Haskell code requires n
stack frames
factorial n = if n > 1 then n * factorial (n-1) else 1
factorial n
multiplies by n
after evaluating factorial (n-1)
Idea: use accumulator argument to make calls tail recursive
factorial n = let loop acc n' = if n' > 1
then loop (acc * n') (n' - 1)
else acc
in loop 1 n
loop
is tail recursive, compiles to an actual loopwhere
clausesGuards let you shorten function declarations:
factorial n = let loop acc n' | n' > 1 = loop (acc * n') (n' - 1)
| otherwise = acc
in loop 1 n
|
" symbol introduces a guardTrue
guard winsotherwise = True
Bindings can also end with where
clauses--like inverted let
factorial n = loop 1 n
where loop acc n' | n' > 1 = loop (acc * n') (n' - 1)
| otherwise = acc
let
, a where
clause scopes over multiple guarded definitionsloop
) often have arguments related to outer function
'
(prime) to the inner-function's argument'
character in variables, except as first characterfactorial n = loop 1 n
where loop acc n' | n' > 1 = loop (acc * n) (n' - 1) -- bug
| otherwise = acc
factorial n0 = loop 1 n0
where loop acc n | n > 1 = loop (acc * n) (n - 1)
| otherwise = acc
factorial n0 = loop 1 n
" causes compile errorBool
- either True
or False
Char
- a unicode code point (i.e., a character)Int
- fixed-size integerInteger
- an arbitrary-size integerDouble
- an IEEE double-precision floating-point number->
type2 - a function from type1 to type2(
type1,
type2,
...,
typeN)
- a tuple()
- a zero-tuple, pronounced unit (kind of like void
in C); there only one value of this type, also written ()
You can declare the type of a symbol or expression with ::
x :: Integer
x = (1 :: Integer) + (1 :: Integer) :: Integer
::
has lower precedence than any function operators (including +
)Function application happens one argument at a time (a.k.a. "currying")
add :: Integer -> (Integer -> Integer)
add arg1 arg2 = arg1 + arg2
add 2 3
is equivalent to (add 2) 3
(add 2)
takes 3 returns 5, so (add 2) has type Integer -> Integer
->
associates to the right, so parens usually omitted in multi-argument function types:fn ::
argType1 ->
argType2 ->
... ->
argTypeN ->
resultType:t
*Main> :t add
add :: Integer -> Integer -> Integer
The data
keyword declares user-defined data types (like struct
in C), E.g.:
data PointT = PointC Double Double deriving Show
PointT
, and constructor, PointC
PointT
contains two Double
sderiving Show
means you can print the type (helpful in GHCI)Types and constructors can use the same name (often do), E.g.:
data Point = Point Double Double deriving Show
One type can have multiple constructors (like a tagged union):
data Point = Cartesian Double Double
| Polar Double Double
deriving Show
data Color = Red | Green | Blue | Indigo | Violet deriving Show
Constructors act like functions producing values of their types
data Point = Point Double Double deriving Show
myPoint :: Point
myPoint = Point 1.0 1.0
data Color = Red | Green | Blue | Indigo | Violet deriving Show
myColor :: Color
myColor = Red
case
statements & function bindings "de-construct" values with patterns
getX, getMaxCoord :: Point -> Double
getX point = case point of
Point x y -> x
getMaxCoord (Point x y) | x > y = x
| otherwise = y
isRed :: Color -> Bool
isRed Red = True -- Only matches constructor Red
isRed c = False -- Lower-case c just a variable
data Maybe a = Just a
| Nothing
data Either a b = Left a
| Right b
You can see these at work in GHCI:
Prelude> :t Just True
Just True :: Maybe Bool
Prelude> :t Left True
Left True :: Either Bool b
Left True
contains a type variable, b
Left True
can be of type Either Bool b
for any type b
_
" can be bound but not used
isJust :: Maybe a -> Bool -- note parametric polymorphism
isJust (Just _) = True
isJust Nothing = False
isRed Red = True
isRed _ = False -- we don't need the non-red value
_
avoids thisYou can deconstruct types and bind variables within guards, E.g.:
addMaybes mx my | Just x <- mx, Just y <- my = Just (x + y)
addMaybes _ _ = Nothing
though often there is a simpler way
addMaybes (Just x) (Just y) = Just (x + y)
addMaybes _ _ = Nothing
We could define homogeneous lists with the data
keyword
data List a = Cons a (List a) | Nil
oneTwoThree = (Cons 1 (Cons 2 (Cons 3 Nil))) :: List Integer
List Integer
, the type is written [Integer]
Cons
, the constructor is called :
and is infixNil
, the empty list is called []
oneTwoThree = 1:2:3:[] :: [Integer]
oneTwoThree' = [1, 2, 3] -- comma-separated elements within brackets
oneTwoThree'' = [1..3] -- define list by a range
A String
is just a list of Char
, so ['a', 'b', 'c'] == "abc"
head :: [a] -> a
head (x:_) = x
head [] = error "head: empty list"
tail :: [a] -> a -- all but first element
tail (_:xs) = xs
tail [] = error "tail: empty list"
a ++ b :: [a] -> [a] -> [a] -- infix operator concatenate lists
[] ++ ys = ys
(x:xs) ++ ys = x : xs ++ ys
length :: [a] -> Int -- This code is from language spec
length [] = 0 -- GHC implements differently, why?
length (_:l) = 1 + length l
filter :: (a -> Bool) -> [a] -> [a]
filter pred [] = []
filter pred (x:xs)
| pred x = x : filter pred xs
| otherwise = filter pred xs
Note function error :: String -> a
reports assertion failures
length
function?http://www.haskell.org/hoogle/
haskell.org
" is too long for me--I change to "ho
"#
marks are for "unboxed types", which are faster but not asymptoticallylen
is tail recursiveHere's a function to count lower-case letters in a String
import Data.Char -- brings function isLower into scope
countLowerCase :: String -> Int
countLowerCase str = length (filter isLower str)
length
, countLowerCase
might run in constant space
Recall Haskell evaluates expressions lazily... Means in most contexts values are interchangeable with function pointers (a.k.a. thunks)
A String
is a [Char]
, which is type with two values, a head and tail
But until each of the head or tail are needed, they can be stored as function pointers
So length
will causes filter
to produce Char
s one at a time
length
does not hold on to characters once counted; can be garbage-collected at will
Here's an even more concise definition
countLowerCase :: String -> Int
countLowerCase = length . filter isLower
The ".
" operator provides function composition
(f . g) x = f (g x)
countLowerCase
's argument had name str
Function composition can be used almost like Unix pipelines
process = countLowercase . toPigLatin . extractComments . unCompress
\
variable(s) ->
body (where \
is pronounced "lambda")Example:
countLowercaseAndDigits :: String -> Int
countLowercaseAndDigits = length . filter (\c -> isLower c || isDigit c)
Lambda abstractions can deconstruct values with patterns, e.g.:
... (\(Right x) -> x) ...
+
, *
, /
, .
, ||
, :
_
, and '
add 1 2
1 `add` 2
!#$%&*+./<=>?@\^|-~
or constructors starting ":
"
(+) 1 2
(,)
, (,,)
, (,,,)
, (,,,,)
, etc.Infix functions can be partially applied in a parenthesized section
stripPunctuation :: String -> String
stripPunctuation = filter (`notElem` "!#$%&*+./<=>?@\\^|-~:")
-- Note above string the SECOND argument to notElem ^
..
, :
, ::
, =
, \
, |
, <-
, ->
, @
, ~
, =>
, --
)infixl
/infixr
/infix
for left/right/no associativityelse
clauses, and let
...in
clauses extend as far to the right as possible (meaning they never stop at any infix operator, no matter how low precedence)infixl 9 !! -- This is the default when fixity unspecified
infixr 9 .
infixr 8 ^, ^^, ⋆⋆
infixl 7 ⋆, /, `quot`, `rem`, `div`, `mod`
infixl 6 +, - -- Unary negation "-" has this fixity, too
infixr 5 ++ -- built-in ":" constructor has this fixity, too
infix 4 ==, /=, <, <=, >=, >, `elem`, `notElem`
infixr 3 &&
infixr 2 ||
infixl 1 >>, >>=
infixr 1 =<<
infixr 0 $, $!, `seq`
If you can't remember, use :i
in GHCI:
Prelude> :i &&
(&&) :: Bool -> Bool -> Bool -- Defined in GHC.Classes
infixr 3 &&
infixl 9
infixr 0
" operators$
is function application, but with lowest precedence
($) :: (a -> b) -> a -> b
f $ x = f x
putStrLn $ "the value of " ++ key ++ " is " ++ show value
seq :: a -> b -> b
evaluates first argument, and second
main = let q = 1 `div` 0
in seq q $ putStrLn "Hello world!\n" -- exception
seq
has to be built into the compiler$!
combines $
and seq
f $! x = x `seq` f x
n0
stack frames in factorial
:factorial n0 = loop 1 n0
where loop acc n | n > 1 = loop (acc * n) (n - 1)
| otherwise = acc
acc
can contain a chain of thunks n
long(((1 * n) * (n - 1)) * (n - 2) ...)
-- Laziness means only evaluated when needed$!
or seq
factorial n0 = loop 1 n0
where loop acc n | n > 1 = (loop $! acc * n) (n - 1)
| otherwise = acc
factorial n0 = loop 1 n0
where loop acc n | n > 1 = acc `seq` loop (acc * n) (n - 1)
| otherwise = acc
cabal update
to create $HOME/.cabal
, download package databaseI highly recommend unconmenting and editing these two lines in $HOME/.cabal/config
documentation: True
library-profiling: True
$HOME/.cabal/bin
to your pathTo install packages for the next examples, run
cabal install http-enumerator utf8-string tagsoup
$HOME/.cabal
, and records them in $HOME/.ghc
$HOME/.cabal
and $HOME/.ghc
import
syntaxMain
, as programs start at function main
in Main
Main
, a module named M must reside in a file named M.hs
Main
modulesLet's add this to the top of our source file
module Main where -- redundant since Main is the default
import qualified Data.ByteString.Lazy.UTF8 as L
import Data.Char
import Network.HTTP.Enumerator (simpleHttp)
import System.Environment
module
name where
" or "module
name (
exported-symbol[,
...]) where
" (non-exported symbols provide modularity)import
module - imports all symbols in moduleimport qualified
module as
ID - prefixes imported symbols with ID.
import
module (
function1[,
function2 ...])
- imports just the named functionsdo
notationmain = do
(url:_) <- getArgs -- Sets url to first command-line argument
page <- simpleHttp url -- Sets page to contents as a ByteString
putStr (L.toString page) -- Converts ByteString to String and prints it
do
block lets you sequence IO actions. In a do
block:
<-
action - binds pat (variable or constructor pattern) to result of executing actionlet
pat =
pure-value - binds pat to pure-value (no "in
..." required)do
block (i.e., can use <-
, need let
for bindings)do
/let
/case
won't parse after prefix function (so say "func $ do
...")main :: IO ()
getArgs :: IO [String]
simpleHttp :: String -> IO L.ByteString -- (really more polymorphic)
putStr :: String -> IO ()
IO
is a parameterized type (just as Maybe
is parameterized)
IO [String]
" means IO action that produces a [String]
if executedMaybe
, we won't use a constructor for IO
, which is somewhat magicWhat if we try to print the first command-line argument as follows?
main = putStr (head getArgs)
head
expects type [String]
, while getArgs
is an IO [String]
IO [String]
to get a [String]
case
, because we don't have a constructor for IO
... Besides, the order and number of deconstructions of something like putStr
matters<-
operator in do
blocks!do page <- simpleHttp url
putStr (L.toString page)
simpleHttp
and putStr
return IO
actions that can change the world
main
action is ever executeddo page <- simpleHttp url
putStr (L.toString page)
do
block builds a compound action from other actions
IO a
actions to the world, extracting values of type a
a
urldump
$ ghc --make urldump
[1 of 1] Compiling Main ( urldump.hs, urldump.o )
Linking urldump ...
$ ./urldump http://www.scs.stanford.edu/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
...
What if you want to run it in GHCI?
$ ghci ./urldump.hs
Prelude Main>
*
before Main
means no access to internal symbols (because compiled)Prelude Main> :load *urldump.hs
[1 of 1] Compiling Main ( urldump.hs, interpreted )
Ok, modules loaded: Main.
*Main> withArgs ["http://cs240h.scs.stanford.edu/"] main
Prelude Main> :main "http://cs240h.scs.stanford.edu/"
return
functionLet's combine simpleHttp
and L.toString
into one function
simpleHttpStr :: String -> IO String
simpleHttpStr url = do
page <- simpleHttp url
return (L.toString page) -- result of do block is last action
Note: return
is not control flow statement, just a function
return :: a -> IO a
IO
do block must have type IO a
for some a
L.toString
returns a String
, use return
to make an IO String
do
block, "let x = e
" is like "x <- return e
" (except recursive).
" (fixity infixr 9
)Function >>=
(pronounced "bind") allows point-free IO composition
(>>=) :: IO a -> (a -> IO b) -> IO b
infixl 1 >>=
Let's re-write urldump
in point-free style
main = getArgs >>= simpleHttpStr . head >>= putStr
>>=
composes left-to-right, while .
goes right-to-leftdo
blocks are just syntactic sugar for calling >>=
main
:main =
getArgs >>= \(url:_) ->
simpleHttp url >>= \page ->
putStr (L.toString page)
Some simple file IO functions may be handy for first lab
type FilePath = String -- makes FilePath synonym for String
getContents :: IO String -- read all stdin
readFile :: FilePath -> IO String -- read (whole) file
writeFile :: FilePath -> String -> IO () -- write file
main = readFile "input" >>= writeFile "output"
unsafeInterleaveIO
creates thunks that execute IO
actions (c.f. more widely used unsafePerformIO
, described in [Peyton Jones])id :: a -> a
id x = x
const :: a -> b -> a
const a _ = a
fst :: (a, b) -> a
fst (a, _) = a
snd :: (a, b) -> b
snd (_, b) = b
print a = putStrLn (show a) -- what's the type? a -> IO ()?
show a = ??? -- how to implement?
id :: a -> a
just passes the value through1 + 1
and 1.0 + 1.0
compute very different functionsshow
converts value to String
, depends entirely on input typederiving Show
" in declarations)Ad-hoc polymorphic functions are called methods and declared with classes
class MyShow a where
myShow :: a -> String
The actual method for each type is defined in an instance declaration
data Point = Point Double Double
instance MyShow Point where
myShow (Point x y) = "(" ++ show x ++ ", " ++ show y ++ ")"
What's the type of a function that calls myShow
? Ask GHCI:
myPrint x = putStrLn $ myShow x
*Main> :t myPrint
myPrint :: MyShow a => a -> IO ()
(
class type-var, ...) =>
" at start of type, E.g.:myPrint :: MyShow a => a -> IO ()
sortAndShow :: (Ord a, MyShow a) => [a] -> String
elem :: (Eq a) => a -> [a] -> Bool
elem _ [] = False
elem x (y:ys) = x==y || elem x ys
add :: (Num a) => a -> a -> a
add arg1 arg2 = arg1 + arg2
myPrint
, you explicitly give it a value of type a
a
's MyShow
instanceLet's say you want to cache result of super-expensive function
superExpensive val = len $ veryExpensive (val :: Int)
where len [] = 0
len (x:xs) = 1 + len xs
cachedResult = superExpensive 5
cachedResult
will start as thunk, be executed once, then contain valueLet's think about the types
*Main> :t superExpensive
superExpensive :: Num a => Int -> a
*Main> :t cachedResult
cachedResult :: Integer
superExpensive
can return any Num
you wantcachedResult :: (Num a) => a
?cachedResult
into a function, undermining our caching goal!f
), rather than a pattern ((x, y)
, (Just x)
)f x =
... ok, f =
... not)Num
, try Integer
then Double
(this sequence can be changed with a default
declaration)This code will compile
-- Compiler infers: show1 :: (Show x) => x -> String
show1 x = show x
But neither of these will:
show2 = show
show3 = \x -> show x
Add type signatures to functions--a good idea anyway for top-level bindings, and sometimes necessary for let
bindings
-- No problem, compiler knows you want ad hoc polymorphism
show2 :: (Show x) => x -> String
show2 = show
Eq
contains '==' and '/=' methods, while Ord
contains <
, >=
, >
, <=
, etc.Ord
instance not also be an Eq
instanceOrd
declares Eq
as a superclass, using a context
class Eq a => Ord a where
(<), (>=), (>), (<=) :: a -> a -> Bool
a <= b = a == b || a < b -- default methods can use superclasses
....
Ord
dictionary can lookup the Eq
dictionaryderiving (Eq, Ord)
to data
declarationsmyShow
for a list of items whose type is of class MyShow
instance (MyShow a) => MyShow [a] where
myShow [] = "[]"
myShow (x:xs) = myShow x ++ ":" ++ myShow xs
Functor
is a class for parameterized types onto which you can map functions:
class Functor f where
fmap :: (a -> b) -> f a -> f b
f
, rather types f a
and f b
An example of a Functor
is Maybe
:
instance Functor Maybe where
fmap _ Nothing = Nothing
fmap f (Just a) = Just (f a)
GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help
Prelude> fmap (+ 1) Nothing
Nothing
Prelude> fmap (+ 1) $ Just 2
Just 3
Functor
sLists are a Functor
[]
can be used as a prefix type ("[] Int
" means "[Int]
") and can be used to declare instancesmap :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs
instance Functor [] where
fmap = map
IO
is a Functor
instance Functor IO where
fmap f x = x >>= return . f
So we could have said:
simpleHttpStr url = fmap L.toString $ simpleHttp url
or, simpler still:
simpleHttpStr = fmap L.toString . simpleHttp
What happens if you try to make an instance of Functor
for Int
?
instance Functor Int where -- compilation error
fmap _ _ = error "placeholder"
fmap :: (a -> b) -> Int a -> Int b
, but Int
not parameterizedInt
, Double
, ()
) directly describes valuesMaybe
, []
, IO
) requires a type parameterEither
, (,)
), requires two parametersInt
, Double
, ()
, etc.)Maybe
, IO
, etc.)Either
)Monad
classreturn
and >>=
are actually methods of a class called Monad
class Monad m where
(>>=) :: m a -> (a -> m b) -> m b
return :: a -> m a
fail :: String -> m a -- called when pattern binding fails
fail s = error s -- default is to throw exception
(>>) :: m a -> m b -> m a
m >> k = m >>= \_ -> k
do
blocks for non-IO purposesMonad
--invent a new monad, and you can still use much existing codeMaybe
monadSystem libraries define a Monad
instance for Maybe
instance Monad Maybe where
(Just x) >>= k = k x
Nothing >>= _ = Nothing
return = Just
fail _ = Nothing
Nothing
to indicate failure
extractA :: String -> Maybe Int
extractB :: String -> Maybe String
...
parseForm :: String -> Maybe Form
parseForm raw = do
a <- extractA raw
b <- extractB raw
...
return (Form a b ...)
IO
threaded WorldNothing
Some data types have a large number of fields
-- Argument to createProcess function
data CreateProcess = CreateProcess CmdSpec (Maybe FilePath)
(Maybe [(String,String)]) StdStream StdStream StdStream Bool
Algebraic data types let you label fields (like C struct
s)
data CreateProcess = CreateProcess {
cmdspec :: CmdSpec,
cwd :: Maybe FilePath,
env :: Maybe [(String,String)],
std_in :: StdStream,
std_out :: StdStream,
std_err :: StdStream,
close_fds :: Bool
}
Let's make an algebraic version of our Point
class
data Point = Point { xCoord :: Double, yCoord :: Double }
data Point = Point { xCoord :: Double, yCoord :: Double }
Can initialize an Algebraic type by naming fields
myPoint = Point { xCoord = 1.0, yCoord = 1.0 }
undefined
- a thunk that throws an exceptionCan also pattern-match on any subset of fields
-- Note the pattern binding assigns the variable on the right of =
getX Point{ xCoord = x } = x
As-patterns are handy to bind a variable and pattern simultaneously (with @
):
getX' p@Point{ xCoord = x }
| x < 100 = x
| otherwise = error $ show p ++ " out of range"
-- Also works with non-algebraic patterns
getX' p@(Point x _) = ...
processString s@('$':_) = ...
processString s = ...
Can use field labels as access functions
getX point = xCoord point
xCoord
works anywhere you can use a function of type Point -> Double
There is a special syntax for updating one or more fields
setX point x = point { xCoord = x }
setXY point x y = point { xCoord = x, yCoord = y }
Obviously doesn't update destructively, but returns new, modified Point
Very handy to maintain state in tail recursive functions and Monads
A !
before a data field type makes it strict - i.e., can't be thunk
data State = State !Int Int
data AlgState = AlgState { accumulator :: !Int
, otherValue :: Int }
In both cases above, the first Int
cannot hold a thunk, but only a value
When initializing an algebraic datatype, it is mandatory to initialize all strict fields (since they cannot hold the undefined
thunk).
Data.Map
maintains efficient, functional lookup tables
words
breaks a String
up into a list of whitespace-separated words