Providence Salumu
http://www.haskell.org/tutorial/).Create a file called hello.hs with the following contents:
main = putStrLn "Hello, world!"Compile your program to a native executable like this:
$ ghc --make hello
[1 of 1] Compiling Main ( hello.hs, hello.o )
Linking hello ...
$ ./hello
Hello, world!
Or run it in the GHCI interpreter like this:
$ ghci hello.hs
GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help
...
Ok, modules loaded: Main.
*Main> main
Hello, world!
*Main>
Haskell uses the = sign to declare bindings:
x = 2 -- Two hyphens introduce a comment
y = 3 -- ...that continues to end of line.
main = let z = x + y -- let introduces local bindings
in print z -- program will print 5
;", which is usually auto-inserted by a layout ruleadd arg1 arg2 = arg1 + arg2 -- defines function add
five = add 2 3 -- invokes function addParentheses can wrap compound expressions, must do so for arguments
bad = print add 2 3 -- error! (print should have only 1 argument)
main = print (add 2 3) -- ok, calls print with 1 argument, 5x = 5
x = 6 -- error, cannot re-bind x
safeDiv x y =
let q = div x y -- safe as q never evaluated if y == 0
in if y == 0 then 0 else q
main = print (safeDiv 1 0) -- prints 0
x = 5 -- this x is not used in main
main = let x = x + 1 -- introduces new x, defined in terms of itself
in print x -- program "diverges" (i.e., loops forever)In C, we use mutable variables to create loops:
long factorial (int n)
{
long result = 1;
while (n > 1)
result *= n--;
return result;
}In Haskell, can use recursion to "re-bind" argument symbols in new scope
factorial n = if n > 1
then n * factorial (n-1)
else 1
This Haskell code requires n stack frames
factorial n = if n > 1 then n * factorial (n-1) else 1factorial n multiplies by n after evaluating factorial (n-1)Idea: use accumulator argument to make calls tail recursive
factorial n = let loop acc n' = if n' > 1
then loop (acc * n') (n' - 1)
else acc
in loop 1 n
loop is tail recursive, compiles to an actual loopwhere clausesGuards let you shorten function declarations:
factorial n = let loop acc n' | n' > 1 = loop (acc * n') (n' - 1)
| otherwise = acc
in loop 1 n
|" symbol introduces a guardTrue guard winsotherwise = TrueBindings can also end with where clauses--like inverted let
factorial n = loop 1 n
where loop acc n' | n' > 1 = loop (acc * n') (n' - 1)
| otherwise = acc
let, a where clause scopes over multiple guarded definitionsloop) often have arguments related to outer function
' (prime) to the inner-function's argument' character in variables, except as first characterfactorial n = loop 1 n
where loop acc n' | n' > 1 = loop (acc * n) (n' - 1) -- bug
| otherwise = acc
factorial n0 = loop 1 n0
where loop acc n | n > 1 = loop (acc * n) (n - 1)
| otherwise = acc
factorial n0 = loop 1 n" causes compile errorBool - either True or FalseChar - a unicode code point (i.e., a character)Int - fixed-size integerInteger - an arbitrary-size integerDouble - an IEEE double-precision floating-point number-> type2 - a function from type1 to type2(type1, type2, ..., typeN) - a tuple() - a zero-tuple, pronounced unit (kind of like void in C); there only one value of this type, also written ()You can declare the type of a symbol or expression with ::
x :: Integer
x = (1 :: Integer) + (1 :: Integer) :: Integer
:: has lower precedence than any function operators (including +)Function application happens one argument at a time (a.k.a. "currying")
add :: Integer -> (Integer -> Integer)
add arg1 arg2 = arg1 + arg2
add 2 3 is equivalent to (add 2) 3(add 2) takes 3 returns 5, so (add 2) has type Integer -> Integer-> associates to the right, so parens usually omitted in multi-argument function types:fn :: argType1 -> argType2 -> ... -> argTypeN -> resultType:t*Main> :t add
add :: Integer -> Integer -> Integer
The data keyword declares user-defined data types (like struct in C), E.g.:
data PointT = PointC Double Double deriving Show
PointT, and constructor, PointCPointT contains two Doublesderiving Show means you can print the type (helpful in GHCI)Types and constructors can use the same name (often do), E.g.:
data Point = Point Double Double deriving ShowOne type can have multiple constructors (like a tagged union):
data Point = Cartesian Double Double
| Polar Double Double
deriving Show
data Color = Red | Green | Blue | Indigo | Violet deriving ShowConstructors act like functions producing values of their types
data Point = Point Double Double deriving Show
myPoint :: Point
myPoint = Point 1.0 1.0
data Color = Red | Green | Blue | Indigo | Violet deriving Show
myColor :: Color
myColor = Redcase statements & function bindings "de-construct" values with patterns
getX, getMaxCoord :: Point -> Double
getX point = case point of
Point x y -> x
getMaxCoord (Point x y) | x > y = x
| otherwise = y
isRed :: Color -> Bool
isRed Red = True -- Only matches constructor Red
isRed c = False -- Lower-case c just a variabledata Maybe a = Just a
| Nothing
data Either a b = Left a
| Right bYou can see these at work in GHCI:
Prelude> :t Just True
Just True :: Maybe Bool
Prelude> :t Left True
Left True :: Either Bool b
Left True contains a type variable, b
Left True can be of type Either Bool b for any type b_" can be bound but not used
isJust :: Maybe a -> Bool -- note parametric polymorphism
isJust (Just _) = True
isJust Nothing = False
isRed Red = True
isRed _ = False -- we don't need the non-red value
_ avoids thisYou can deconstruct types and bind variables within guards, E.g.:
addMaybes mx my | Just x <- mx, Just y <- my = Just (x + y)
addMaybes _ _ = Nothing
though often there is a simpler way
addMaybes (Just x) (Just y) = Just (x + y)
addMaybes _ _ = NothingWe could define homogeneous lists with the data keyword
data List a = Cons a (List a) | Nil
oneTwoThree = (Cons 1 (Cons 2 (Cons 3 Nil))) :: List IntegerList Integer, the type is written [Integer]Cons, the constructor is called : and is infixNil, the empty list is called []oneTwoThree = 1:2:3:[] :: [Integer]
oneTwoThree' = [1, 2, 3] -- comma-separated elements within brackets
oneTwoThree'' = [1..3] -- define list by a rangeA String is just a list of Char, so ['a', 'b', 'c'] == "abc"
head :: [a] -> a
head (x:_) = x
head [] = error "head: empty list"
tail :: [a] -> a -- all but first element
tail (_:xs) = xs
tail [] = error "tail: empty list"
a ++ b :: [a] -> [a] -> [a] -- infix operator concatenate lists
[] ++ ys = ys
(x:xs) ++ ys = x : xs ++ ys
length :: [a] -> Int -- This code is from language spec
length [] = 0 -- GHC implements differently, why?
length (_:l) = 1 + length l
filter :: (a -> Bool) -> [a] -> [a]
filter pred [] = []
filter pred (x:xs)
| pred x = x : filter pred xs
| otherwise = filter pred xs
Note function error :: String -> a reports assertion failures
length function?http://www.haskell.org/hoogle/haskell.org" is too long for me--I change to "ho"# marks are for "unboxed types", which are faster but not asymptoticallylen is tail recursiveHere's a function to count lower-case letters in a String
import Data.Char -- brings function isLower into scope
countLowerCase :: String -> Int
countLowerCase str = length (filter isLower str)length, countLowerCase might run in constant space
Recall Haskell evaluates expressions lazily... Means in most contexts values are interchangeable with function pointers (a.k.a. thunks)
A String is a [Char], which is type with two values, a head and tail
But until each of the head or tail are needed, they can be stored as function pointers
So length will causes filter to produce Chars one at a time
length does not hold on to characters once counted; can be garbage-collected at will
Here's an even more concise definition
countLowerCase :: String -> Int
countLowerCase = length . filter isLowerThe "." operator provides function composition
(f . g) x = f (g x)
countLowerCase's argument had name strFunction composition can be used almost like Unix pipelines
process = countLowercase . toPigLatin . extractComments . unCompress\variable(s) -> body (where \ is pronounced "lambda")Example:
countLowercaseAndDigits :: String -> Int
countLowercaseAndDigits = length . filter (\c -> isLower c || isDigit c)Lambda abstractions can deconstruct values with patterns, e.g.:
... (\(Right x) -> x) ...
+, *, /, ., ||, :_, and '
add 1 21 `add` 2!#$%&*+./<=>?@\^|-~ or constructors starting ":"
(+) 1 2(,), (,,), (,,,), (,,,,), etc.Infix functions can be partially applied in a parenthesized section
stripPunctuation :: String -> String
stripPunctuation = filter (`notElem` "!#$%&*+./<=>?@\\^|-~:")
-- Note above string the SECOND argument to notElem ^.., :, ::, =, \, |, <-, ->, @, ~, =>, --)infixl/infixr/infix for left/right/no associativityelse clauses, and let...in clauses extend as far to the right as possible (meaning they never stop at any infix operator, no matter how low precedence)infixl 9 !! -- This is the default when fixity unspecified
infixr 9 .
infixr 8 ^, ^^, ⋆⋆
infixl 7 ⋆, /, `quot`, `rem`, `div`, `mod`
infixl 6 +, - -- Unary negation "-" has this fixity, too
infixr 5 ++ -- built-in ":" constructor has this fixity, too
infix 4 ==, /=, <, <=, >=, >, `elem`, `notElem`
infixr 3 &&
infixr 2 ||
infixl 1 >>, >>=
infixr 1 =<<
infixr 0 $, $!, `seq`
If you can't remember, use :i in GHCI:
Prelude> :i &&
(&&) :: Bool -> Bool -> Bool -- Defined in GHC.Classes
infixr 3 &&
infixl 9infixr 0" operators$ is function application, but with lowest precedence
($) :: (a -> b) -> a -> b
f $ x = f x
putStrLn $ "the value of " ++ key ++ " is " ++ show valueseq :: a -> b -> b evaluates first argument, and second
main = let q = 1 `div` 0
in seq q $ putStrLn "Hello world!\n" -- exception
seq has to be built into the compiler$! combines $ and seq
f $! x = x `seq` f xn0 stack frames in factorial:factorial n0 = loop 1 n0
where loop acc n | n > 1 = loop (acc * n) (n - 1)
| otherwise = acc
acc can contain a chain of thunks n long(((1 * n) * (n - 1)) * (n - 2) ...) -- Laziness means only evaluated when needed$! or seqfactorial n0 = loop 1 n0
where loop acc n | n > 1 = (loop $! acc * n) (n - 1)
| otherwise = acc
factorial n0 = loop 1 n0
where loop acc n | n > 1 = acc `seq` loop (acc * n) (n - 1)
| otherwise = acc
cabal update to create $HOME/.cabal, download package databaseI highly recommend unconmenting and editing these two lines in $HOME/.cabal/config
documentation: True
library-profiling: True
$HOME/.cabal/bin to your pathTo install packages for the next examples, run
cabal install http-enumerator utf8-string tagsoup
$HOME/.cabal, and records them in $HOME/.ghc$HOME/.cabal and $HOME/.ghcimport syntaxMain, as programs start at function main in MainMain, a module named M must reside in a file named M.hsMain modulesLet's add this to the top of our source file
module Main where -- redundant since Main is the default
import qualified Data.ByteString.Lazy.UTF8 as L
import Data.Char
import Network.HTTP.Enumerator (simpleHttp)
import System.Environment
module name where" or "module name (exported-symbol[, ...]) where" (non-exported symbols provide modularity)import module - imports all symbols in moduleimport qualified module as ID - prefixes imported symbols with ID.import module (function1[, function2 ...]) - imports just the named functionsdo notationmain = do
(url:_) <- getArgs -- Sets url to first command-line argument
page <- simpleHttp url -- Sets page to contents as a ByteString
putStr (L.toString page) -- Converts ByteString to String and prints it
do block lets you sequence IO actions. In a do block:
<- action - binds pat (variable or constructor pattern) to result of executing actionlet pat = pure-value - binds pat to pure-value (no "in ..." required)do block (i.e., can use <-, need let for bindings)do/let/case won't parse after prefix function (so say "func $ do ...")main :: IO ()
getArgs :: IO [String]
simpleHttp :: String -> IO L.ByteString -- (really more polymorphic)
putStr :: String -> IO ()
IO is a parameterized type (just as Maybe is parameterized)
IO [String]" means IO action that produces a [String] if executedMaybe, we won't use a constructor for IO, which is somewhat magicWhat if we try to print the first command-line argument as follows?
main = putStr (head getArgs)
head expects type [String], while getArgs is an IO [String]IO [String] to get a [String]
case, because we don't have a constructor for IO... Besides, the order and number of deconstructions of something like putStr matters<- operator in do blocks!do page <- simpleHttp url
putStr (L.toString page)
simpleHttp and putStr return IO actions that can change the world
main action is ever executeddo page <- simpleHttp url
putStr (L.toString page)
do block builds a compound action from other actions
IO a actions to the world, extracting values of type aaurldump$ ghc --make urldump
[1 of 1] Compiling Main ( urldump.hs, urldump.o )
Linking urldump ...
$ ./urldump http://www.scs.stanford.edu/
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
...
What if you want to run it in GHCI?
$ ghci ./urldump.hs
Prelude Main>
* before Main means no access to internal symbols (because compiled)Prelude Main> :load *urldump.hs
[1 of 1] Compiling Main ( urldump.hs, interpreted )
Ok, modules loaded: Main.
*Main> withArgs ["http://cs240h.scs.stanford.edu/"] main
Prelude Main> :main "http://cs240h.scs.stanford.edu/"
return functionLet's combine simpleHttp and L.toString into one function
simpleHttpStr :: String -> IO String
simpleHttpStr url = do
page <- simpleHttp url
return (L.toString page) -- result of do block is last actionNote: return is not control flow statement, just a function
return :: a -> IO a
IO do block must have type IO a for some aL.toString returns a String, use return to make an IO Stringdo block, "let x = e" is like "x <- return e" (except recursive)." (fixity infixr 9)Function >>= (pronounced "bind") allows point-free IO composition
(>>=) :: IO a -> (a -> IO b) -> IO b
infixl 1 >>=Let's re-write urldump in point-free style
main = getArgs >>= simpleHttpStr . head >>= putStr
>>= composes left-to-right, while . goes right-to-leftdo blocks are just syntactic sugar for calling >>=
main:main =
getArgs >>= \(url:_) ->
simpleHttp url >>= \page ->
putStr (L.toString page)Some simple file IO functions may be handy for first lab
type FilePath = String -- makes FilePath synonym for String
getContents :: IO String -- read all stdin
readFile :: FilePath -> IO String -- read (whole) file
writeFile :: FilePath -> String -> IO () -- write filemain = readFile "input" >>= writeFile "output"
unsafeInterleaveIO creates thunks that execute IO actions (c.f. more widely used unsafePerformIO, described in [Peyton Jones])id :: a -> a
id x = x
const :: a -> b -> a
const a _ = a
fst :: (a, b) -> a
fst (a, _) = a
snd :: (a, b) -> b
snd (_, b) = b
print a = putStrLn (show a) -- what's the type? a -> IO ()?
show a = ??? -- how to implement?
id :: a -> a just passes the value through1 + 1 and 1.0 + 1.0 compute very different functionsshow converts value to String, depends entirely on input typederiving Show" in declarations)Ad-hoc polymorphic functions are called methods and declared with classes
class MyShow a where
myShow :: a -> StringThe actual method for each type is defined in an instance declaration
data Point = Point Double Double
instance MyShow Point where
myShow (Point x y) = "(" ++ show x ++ ", " ++ show y ++ ")"
What's the type of a function that calls myShow? Ask GHCI:
myPrint x = putStrLn $ myShow x
*Main> :t myPrint
myPrint :: MyShow a => a -> IO ()
(class type-var, ...) =>" at start of type, E.g.:myPrint :: MyShow a => a -> IO ()
sortAndShow :: (Ord a, MyShow a) => [a] -> String
elem :: (Eq a) => a -> [a] -> Bool
elem _ [] = False
elem x (y:ys) = x==y || elem x ys
add :: (Num a) => a -> a -> a
add arg1 arg2 = arg1 + arg2myPrint, you explicitly give it a value of type aa's MyShow instanceLet's say you want to cache result of super-expensive function
superExpensive val = len $ veryExpensive (val :: Int)
where len [] = 0
len (x:xs) = 1 + len xs
cachedResult = superExpensive 5
cachedResult will start as thunk, be executed once, then contain valueLet's think about the types
*Main> :t superExpensive
superExpensive :: Num a => Int -> a
*Main> :t cachedResult
cachedResult :: Integer
superExpensive can return any Num you wantcachedResult :: (Num a) => a?cachedResult into a function, undermining our caching goal!f), rather than a pattern ((x, y), (Just x))f x = ... ok, f = ... not)Num, try Integer then Double (this sequence can be changed with a default declaration)This code will compile
-- Compiler infers: show1 :: (Show x) => x -> String
show1 x = show xBut neither of these will:
show2 = show
show3 = \x -> show x
Add type signatures to functions--a good idea anyway for top-level bindings, and sometimes necessary for let bindings
-- No problem, compiler knows you want ad hoc polymorphism
show2 :: (Show x) => x -> String
show2 = showEq contains '==' and '/=' methods, while Ord contains <, >=, >, <=, etc.Ord instance not also be an Eq instanceOrd declares Eq as a superclass, using a context
class Eq a => Ord a where
(<), (>=), (>), (<=) :: a -> a -> Bool
a <= b = a == b || a < b -- default methods can use superclasses
....Ord dictionary can lookup the Eq dictionaryderiving (Eq, Ord) to data declarationsmyShow for a list of items whose type is of class MyShowinstance (MyShow a) => MyShow [a] where
myShow [] = "[]"
myShow (x:xs) = myShow x ++ ":" ++ myShow xsFunctor is a class for parameterized types onto which you can map functions:
class Functor f where
fmap :: (a -> b) -> f a -> f b
f, rather types f a and f bAn example of a Functor is Maybe:
instance Functor Maybe where
fmap _ Nothing = Nothing
fmap f (Just a) = Just (f a)
GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help
Prelude> fmap (+ 1) Nothing
Nothing
Prelude> fmap (+ 1) $ Just 2
Just 3
FunctorsLists are a Functor
[] can be used as a prefix type ("[] Int" means "[Int]") and can be used to declare instancesmap :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs
instance Functor [] where
fmap = mapIO is a Functor
instance Functor IO where
fmap f x = x >>= return . f
So we could have said:
simpleHttpStr url = fmap L.toString $ simpleHttp url
or, simpler still:
simpleHttpStr = fmap L.toString . simpleHttpWhat happens if you try to make an instance of Functor for Int?
instance Functor Int where -- compilation error
fmap _ _ = error "placeholder"
fmap :: (a -> b) -> Int a -> Int b, but Int not parameterizedInt, Double, ()) directly describes valuesMaybe, [], IO) requires a type parameterEither, (,)), requires two parametersInt, Double, (), etc.)Maybe, IO, etc.)Either)Monad classreturn and >>= are actually methods of a class called Monadclass Monad m where
(>>=) :: m a -> (a -> m b) -> m b
return :: a -> m a
fail :: String -> m a -- called when pattern binding fails
fail s = error s -- default is to throw exception
(>>) :: m a -> m b -> m a
m >> k = m >>= \_ -> k
do blocks for non-IO purposesMonad--invent a new monad, and you can still use much existing codeMaybe monadSystem libraries define a Monad instance for Maybe
instance Monad Maybe where
(Just x) >>= k = k x
Nothing >>= _ = Nothing
return = Just
fail _ = NothingNothing to indicate failure
extractA :: String -> Maybe Int
extractB :: String -> Maybe String
...
parseForm :: String -> Maybe Form
parseForm raw = do
a <- extractA raw
b <- extractB raw
...
return (Form a b ...)
IO threaded WorldNothingSome data types have a large number of fields
-- Argument to createProcess function
data CreateProcess = CreateProcess CmdSpec (Maybe FilePath)
(Maybe [(String,String)]) StdStream StdStream StdStream Bool
Algebraic data types let you label fields (like C structs)
data CreateProcess = CreateProcess {
cmdspec :: CmdSpec,
cwd :: Maybe FilePath,
env :: Maybe [(String,String)],
std_in :: StdStream,
std_out :: StdStream,
std_err :: StdStream,
close_fds :: Bool
}Let's make an algebraic version of our Point class
data Point = Point { xCoord :: Double, yCoord :: Double }data Point = Point { xCoord :: Double, yCoord :: Double }
Can initialize an Algebraic type by naming fields
myPoint = Point { xCoord = 1.0, yCoord = 1.0 }
undefined - a thunk that throws an exceptionCan also pattern-match on any subset of fields
-- Note the pattern binding assigns the variable on the right of =
getX Point{ xCoord = x } = x
As-patterns are handy to bind a variable and pattern simultaneously (with @):
getX' p@Point{ xCoord = x }
| x < 100 = x
| otherwise = error $ show p ++ " out of range"
-- Also works with non-algebraic patterns
getX' p@(Point x _) = ...
processString s@('$':_) = ...
processString s = ...Can use field labels as access functions
getX point = xCoord point
xCoord works anywhere you can use a function of type Point -> DoubleThere is a special syntax for updating one or more fields
setX point x = point { xCoord = x }
setXY point x y = point { xCoord = x, yCoord = y }
Obviously doesn't update destructively, but returns new, modified Point
Very handy to maintain state in tail recursive functions and Monads
A ! before a data field type makes it strict - i.e., can't be thunk
data State = State !Int Int
data AlgState = AlgState { accumulator :: !Int
, otherValue :: Int }
In both cases above, the first Int cannot hold a thunk, but only a value
When initializing an algebraic datatype, it is mandatory to initialize all strict fields (since they cannot hold the undefined thunk).
Data.Map maintains efficient, functional lookup tables
words breaks a String up into a list of whitespace-separated words