Reading and Writing Files

putChar, putStr, putStrLn, getChar, getLine, and getContents have counterparts whose names are prefixed with an h. The h-versions of these functions read from or write to a file handle (h for handle). In fact, if you import the System.IO module, then you get access to the standard input/output streams available to any program:

stdin, stdout, stderr :: Handle

putChar, putStr, and putStrLn are defined by calling hPutChar, hPutStr, and hPutStrLn with stdout as their first arguments:

putChar  = hPutChar  stdout
putStr   = hPutStr   stdout
putStrLn = hPutStrLn stdout

Similarly, getChar, getLine, and getContents read from stdin using hGetChar, hGetLine, and hGetContents:

getChar     = hGetChar     stdin
getLine     = hGetLine     stdin
getContents = hGetContents stdin

The types of hPutChar, hPutStr, hPutStrLn, hGetChar, hGetLine, and hGetContents are

hPutChar     :: Handle -> Char   -> IO ()
hPutStr      :: Handle -> String -> IO ()
hPutStrLn    :: Handle -> String -> IO ()
hGetChar     :: Handle           -> IO Char
hGetLine     :: Handle           -> IO String
hGetContents :: Handle           -> IO String
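Since the h-functions take the handle as an ordinary argument, the same code can target any stream. Here is a minimal sketch of this idea. The names warn and copyLine are made up for this illustration; they are not part of System.IO.

```haskell
import System.IO

-- Hypothetical helper: print a message on standard error instead of
-- standard output.
warn :: String -> IO ()
warn msg = hPutStrLn stderr ("warning: " ++ msg)

-- Hypothetical helper: read one line from the first handle and write it
-- to the second handle, followed by a newline.
copyLine :: Handle -> Handle -> IO ()
copyLine from to = do
    line <- hGetLine from
    hPutStrLn to line
```

In GHCi, copyLine stdin stdout waits for a line of input and echoes it back, which amounts to the same thing as getLine >>= putStrLn.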

A Handle is what we use in Haskell to hold on to an open file from which we can read and to which we can write. stdin, stdout, and stderr are handles that are always available, without the need to open them. If we want to read from a file on our file system, then we need to open this file using

openFile :: FilePath -> IOMode -> IO Handle

FilePath is the type used to represent file names. It's an alias for String.

type FilePath = String

IOMode indicates whether we want to open the file for reading, writing, both, or appending.

data IOMode
    = ReadMode       -- File can only be read
    | WriteMode      -- File can only be written
    | AppendMode     -- File can only be written past its original content
    | ReadWriteMode  -- File can be read and written

In particular, opening a file in WriteMode erases its original content. ReadMode, WriteMode, and ReadWriteMode then position the file cursor at the beginning of the file. AppendMode does not alter the file when opening it and positions the file cursor at the end of the file, to allow adding more content at the end.

A file opened with openFile can be closed using

hClose :: Handle -> IO ()

For example, if we have the file

mantra.txt
I will learn to program in Haskell!
I will learn to program in Haskell!
I will learn to program in Haskell!

then we can open it using openFile, read its contents using hGetContents, print them on screen using putStr, and finally close the file using hClose:

GHCi
>>> import System.IO
>>> :{
  | do
  |     h <- openFile "mantra.txt" ReadMode
  |     txt <- hGetContents h
  |     putStr txt
  |     hClose h
  | :}
I will learn to program in Haskell!
I will learn to program in Haskell!
I will learn to program in Haskell!
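The difference between WriteMode and AppendMode described earlier can be observed with a similar sequence of calls. The following is only a sketch: readLines, writeLineIn, and modesDemo are hypothetical helpers, and readLines uses hIsEOF from System.IO to read the file line by line before closing it.

```haskell
import System.IO

-- Read all lines of a file eagerly, using hIsEOF and hGetLine, so that
-- closing the handle afterwards is safe.
readLines :: FilePath -> IO [String]
readLines path = do
    h <- openFile path ReadMode
    ls <- loop h
    hClose h
    return ls
  where
    loop h = do
        atEnd <- hIsEOF h
        if atEnd
            then return []
            else do
                l <- hGetLine h
                rest <- loop h
                return (l : rest)

-- Write one line to a file opened in the given mode, then close it.
writeLineIn :: IOMode -> FilePath -> String -> IO ()
writeLineIn mode path line = do
    h <- openFile path mode
    hPutStrLn h line
    hClose h

-- Returns the file's lines after each of three steps: WriteMode,
-- AppendMode, and WriteMode again.
modesDemo :: FilePath -> IO ([String], [String], [String])
modesDemo path = do
    writeLineIn WriteMode path "first"
    a <- readLines path
    writeLineIn AppendMode path "second"
    b <- readLines path
    writeLineIn WriteMode path "third"
    c <- readLines path
    return (a, b, c)
```

Running modesDemo on a scratch file should produce (["first"],["first","second"],["third"]): appending keeps the old line, while the second WriteMode erases everything written before.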

An important point we will explore more closely shortly: In other languages, the equivalent of the following code would work just fine. In Haskell, it fails:

GHCi
>>> :{
  | do
  |     h <- openFile "mantra.txt" ReadMode
  |     txt <- hGetContents h
  |     hClose h
  |     putStr txt
  | :}
*** Exception: mantra.txt: hGetContents: illegal operation (delayed read on closed handle)

Apparently, GHCi doesn't like what we're doing. We were trying to read the contents of the file into txt using hGetContents. Being good programmers who clean up after themselves as soon as possible, we closed the handle h, and then we printed the file contents using putStr.

The reason why this doesn't work so well is Haskell's lazy evaluation. In most languages, if we read the entire file using the equivalent of hGetContents, we get a string object that stores the contents of the whole file. In Haskell, we also get a String, but a String is a list of Chars, and this list is produced lazily, as we consume its characters. Most of the time, this is a good thing because it avoids the dance necessary in other languages, where we read large files in chunks and process these chunks one at a time to avoid using too much memory. In Haskell, we read the file contents as we need them, and the garbage collector disposes of data read earlier unless we hold on to it in a variable.

Here, lazy evaluation bit us in our behind. We closed the file using hClose before we retrieved all its contents: We didn't inspect txt in any way before closing the file, so no data was read before closing the file. Then we tried to print txt, so the runtime system tried to retrieve the contents of h, and it refused to do this because h was already closed. We'll explore this more closely in the next subsection.
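One way to make the failing example work is to force the whole string before closing the handle. The sketch below does this with evaluate from Control.Exception. The name readStrictly is hypothetical, and the price we pay is that the entire file now sits in memory at once.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: read a whole file and close it right away.
readStrictly :: FilePath -> IO String
readStrictly path = do
    h <- openFile path ReadMode
    txt <- hGetContents h
    -- length has to walk the whole list, so evaluating it forces every
    -- character to be read from the file before we close the handle.
    _ <- evaluate (length txt)
    hClose h
    return txt
```

With this helper, readStrictly "mantra.txt" >>= putStr prints the three lines of the file even though the handle is closed before putStr runs.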

You can avoid these pitfalls of lazy I/O by simply never calling hClose. A file handle read using hGetContents gets closed automatically when the whole file contents have been read or when the handle gets garbage collected. The only reason why you may want to call hClose is if your program reads or writes lots of files in rapid succession. In that case, the garbage collector may not be fast enough to collect all the handles of previously opened files before you open new files, and this may lead to your program running out of file handles.[1]

Before having a closer look at lazy I/O, let's discuss four more useful functions for reading and writing files. First, there's

withFile :: FilePath -> IOMode -> (Handle -> IO r) -> IO r

In Java, a common idiom to read a file is

BufferedReader br = new BufferedReader(new FileReader("file.txt"));
try {
    // Do something with br
} finally {
    br.close();
}

The key point is that things can go wrong while reading from br, in which case an exception is thrown. By closing br in a finally clause, the file gets closed whether we read it successfully or encountered an exception.

withFile behaves in the same way. The first argument is a file path, the second argument is the mode in which we want to open the file. The third argument is what you can think of as the body of the try-block in the equivalent Java code. It's a function that takes a handle to the file opened by withFile and produces some value of type r. withFile runs this function on the file it opened and returns the result this function produces, but before doing so, it closes the file. So our little code example above can be expressed more idiomatically as

GHCi
>>> :{
  | withFile "mantra.txt" ReadMode $ \h -> do
  |     txt <- hGetContents h
  |     putStr txt
  | :}
I will learn to program in Haskell!
I will learn to program in Haskell!
I will learn to program in Haskell!
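The analogy with try/finally can be made concrete using bracket from Control.Exception, which runs a release action whether or not the body throws an exception. The following is only a sketch of how a function like withFile could be written. The primed name withFile' and the helper firstLine are ours, and this is not claimed to be the definition used by System.IO.

```haskell
import Control.Exception (bracket)
import System.IO

-- A sketch of withFile in terms of bracket: open the file, run the body
-- on the handle, and close the handle even if the body throws.
withFile' :: FilePath -> IOMode -> (Handle -> IO r) -> IO r
withFile' path mode = bracket (openFile path mode) hClose

-- Hypothetical usage: return the first line of a file. hGetLine reads
-- the line right away, so closing the handle afterwards is safe.
firstLine :: FilePath -> IO String
firstLine path = withFile' path ReadMode hGetLine
```

For our example file, firstLine "mantra.txt" should return "I will learn to program in Haskell!".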

We need to make sure that we consume the contents of the file within the withFile block. Otherwise, the file gets closed before we are finished reading it, and we run into the same problem as before:

GHCi
>>> :{
  | do
  |     txt <- withFile "mantra.txt" ReadMode hGetContents
  |     putStr txt
  | :}
*** Exception: mantra.txt: hGetContents: illegal operation (delayed read on closed handle)
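If we only need a small result computed from the file contents, we can force that result inside the withFile block and return it instead of the lazy string. Here is a minimal sketch, again assuming evaluate from Control.Exception. The name countLines is made up for this illustration.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical helper: count the lines of a file.
countLines :: FilePath -> IO Int
countLines path = withFile path ReadMode $ \h -> do
    txt <- hGetContents h
    -- evaluate forces the count, and thus the reading of the whole file,
    -- while the handle is still open.
    evaluate (length (lines txt))
```

countLines "mantra.txt" should return 3. Had we written return (length (lines txt)) instead, the count would still be an unevaluated expression when withFile closes the file, and inspecting it later would fail just like the example above.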

Because file handles get closed automatically once the whole file has been read or the handle gets garbage collected, we rarely need withFile just to ensure that files get closed. It matters mainly when we need to carefully manage our program's open file handles.

For reading the contents of a whole file or writing the contents of a whole file in one go, we have the following convenient functions:

readFile  :: FilePath ->           IO String
writeFile :: FilePath -> String -> IO ()

readFile reads the whole file and returns its contents in a string. writeFile writes the given string to the file. No need to fiddle with file handles. readFile is lazy, as can be seen from its implementation:

readFile name = openFile name ReadMode >>= hGetContents

Thus, it relies on the file getting closed eventually once the contents have been read or the handle gets garbage collected. As a consequence, we cannot use readFile if we need to manage open file handles carefully.

You can use our old friend :sprint to verify that readFile is indeed lazy:

GHCi
>>> txt <- readFile "mantra.txt"
>>> :sprint txt
txt = _
>>> putStr txt
I will learn to program in Haskell!
I will learn to program in Haskell!
I will learn to program in Haskell!
>>> :sprint txt
txt = "I will learn to program in Haskell!
I will learn to program in Haskell!
I will learn to program in Haskell!
"

Note the output of the first :sprint txt. It says that txt = _, that is, txt has not been evaluated at all. Printing txt using putStr txt forces txt to be evaluated fully, that is, to be read fully from the input file. After that, :sprint txt prints the complete contents of the file.

writeFile is not lazy because there is no need to keep the file open: It consumes the string to be written fully, and then it is done writing to the file. Thus, the implementation of writeFile uses withFile to make sure that the file gets closed immediately before writeFile returns:

writeFile name txt = withFile name WriteMode (\h -> hPutStr h txt)
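Together, readFile and writeFile make simple file transformations very short. As a small sketch, the hypothetical function shout below copies one file to another, converting its text to upper case along the way.

```haskell
import Data.Char (toUpper)

-- Hypothetical helper: copy src to dst, converting all letters to upper
-- case. readFile produces the text lazily, and writeFile consumes it
-- fully before it returns.
shout :: FilePath -> FilePath -> IO ()
shout src dst = do
    txt <- readFile src
    writeFile dst (map toUpper txt)
```

Note that src and dst should be different files. Because readFile is lazy, src is still open for reading when writeFile tries to open its file for writing, and GHC refuses to open a file for writing while it is open elsewhere.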

The final function I should mention here is hSeek. When reading a file whose contents are effectively a data structure, it is often necessary to follow pointers between different file locations. Thus, the file is no longer read sequentially. This is enabled using

hSeek :: Handle -> SeekMode -> Integer -> IO ()

This behaves much like the fseek function in C. It allows the file cursor to be moved. The third argument, an Integer, is the position to which to set the file cursor. The SeekMode argument determines how to interpret this position:

data SeekMode
    = AbsoluteSeek    -- Position is relative to start of the file
    | RelativeSeek    -- Position is relative to the current cursor position
    | SeekFromEnd     -- Position is relative to the end of the file

So a position of 5 with AbsoluteSeek as the SeekMode places the cursor 5 bytes past the start of the file, that is, on the 6th byte, because position 0 is the first byte. With RelativeSeek as the SeekMode, the new position is 5 bytes after the current position. A position of -5 with SeekFromEnd as the SeekMode positions the cursor on the 5th byte from the end of the file.
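The three seek modes can be tried out with a few small helpers. This is a sketch under the assumption that every character in the file occupies exactly one byte (plain ASCII text). The function names are hypothetical.

```haskell
import System.IO

-- Character found at the given offset from the start of the file.
charFromStart :: FilePath -> Integer -> IO Char
charFromStart path off = withFile path ReadMode $ \h -> do
    hSeek h AbsoluteSeek off
    hGetChar h

-- Character found at the given (negative) offset from the end of the file.
charFromEnd :: FilePath -> Integer -> IO Char
charFromEnd path off = withFile path ReadMode $ \h -> do
    hSeek h SeekFromEnd off
    hGetChar h

-- Read the first character, skip the given number of bytes, and read
-- one more character.
firstAndSkip :: FilePath -> Integer -> IO (Char, Char)
firstAndSkip path skip = withFile path ReadMode $ \h -> do
    c1 <- hGetChar h
    hSeek h RelativeSeek skip
    c2 <- hGetChar h
    return (c1, c2)
```

For a file containing just the ten characters abcdefghij, charFromStart with offset 5 should return 'f' (the 6th character), charFromEnd with offset -3 should return 'h', and firstAndSkip with a skip of 5 should return ('a','g'). Since hGetChar reads its character immediately, using these helpers inside withFile is safe.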

GHCi
>>> :{
  | conts file = do
  |     hSeek file AbsoluteSeek 0
  |     go
  |   where
  |     go = do
  |         atEnd <- hIsEOF file
  |         if atEnd then
  |             return []
  |         else do
  |             x <- hGetChar file
  |             xs <- go
  |             return (x:xs)
  | :}
>>> :{
  | do
  |     file <- openFile "mantra.txt" ReadMode
  |     txt1 <- conts file
  |     txt2 <- conts file
  |     putStr $ txt1 ++ txt2
  | :}
I will learn to program in Haskell!
I will learn to program in Haskell!
I will learn to program in Haskell!
I will learn to program in Haskell!
I will learn to program in Haskell!
I will learn to program in Haskell!

I had to cook my own function conts to read the contents of a file here. That's because hGetContents file puts the file handle file into a "semi-closed" state. This state allows further data to be retrieved from the handle by inspecting the lazy list returned by hGetContents, but no other file operations are permitted anymore. conts does not put the handle into a semi-closed state, so we are able to call conts on the same handle a second time. conts resets the file cursor to the beginning of the file and then reads the characters in the file one character at a time.
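The semi-closed state can be observed directly. The sketch below calls hGetContents on a handle and then tries to seek on the same handle, catching the resulting exception with try from Control.Exception. The function name is made up for this illustration.

```haskell
import Control.Exception (IOException, try)
import System.IO

-- Returns True if seeking on a handle fails after hGetContents has been
-- called on it.
seekFailsAfterGetContents :: FilePath -> IO Bool
seekFailsAfterGetContents path = do
    h <- openFile path ReadMode
    _ <- hGetContents h      -- h is now semi-closed
    r <- try (hSeek h AbsoluteSeek 0) :: IO (Either IOException ())
    hClose h
    return (either (const True) (const False) r)
```

seekFailsAfterGetContents "mantra.txt" should return True: once hGetContents has been called, hSeek is one of the operations the handle no longer permits.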

There are many more functions for working with files and file handles. Hoogle System.IO to learn about which ones there are. You won't need any I/O functions not discussed here for any project in this course, but if you decide to continue programming in Haskell, it will help to have an overview of the different functions offered by the standard library.


  1. On most operating systems, file handles are a limited resource. A program can never have more than a certain number of handles open at any given time. The default limit on Linux systems is 1024, but this can be changed by the system administrator. The default limit on macOS is 256. I don't have a Windows system to test the limit there.