Read a remote file as a stream using Go


17-06-2015 08:28 by depado

Basic Approach: Downloading the whole file into memory

The most basic way to read and use remote content is to download the whole file into memory. This is pretty easy to do in Go:

package main

import (
    "io"
    "log"
    "net/http"
)

func main() {
    resp, err := http.Get("http://mylargefile.com/thefile.txt")
    if err != nil {
        log.Fatalf("Error while getting the url : %v", err)
    }
    defer resp.Body.Close()
    // io.ReadAll (Go 1.16+, formerly ioutil.ReadAll) buffers the entire body in memory
    content, err := io.ReadAll(resp.Body)
    if err != nil {
        log.Fatalf("Error while reading the body : %v", err)
    }
    // Do things with the content
    log.Printf("Read %d bytes", len(content))
}

Pretty easy, and this approach works in most cases. It has a few drawbacks though, starting with performance: you have to wait for the whole file to be downloaded before doing anything with its content. And what happens if the file is really large? Like, bigger-than-your-total-RAM large? The solution is simple: read the remote file's content as a stream using a scanner.

Stream Approach

package main

import (
    "bufio"
    "log"
    "net/http"
)

func main() {
    resp, err := http.Get("http://mylargefile.com/thebiggestfileever.txt")
    if err != nil {
        log.Fatalf("Error while getting the url : %v", err)
    }
    defer resp.Body.Close()
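    scanner := bufio.NewScanner(resp.Body)
    for scanner.Scan() {
        // Do something with each line, available as scanner.Text()
    }
    // Scan returns false both at the end of the body and on error, so check which one it was
    if err := scanner.Err(); err != nil {
        log.Fatalf("Error while scanning the body : %v", err)
    }
}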

As you can see, the scanner processes the file line by line while it is still being downloaded. Only the current line has to be held in memory, and processing can start before the download finishes. It is no more complex to use than the basic approach, and it can save some execution time.
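
One thing to watch out for with the stream approach: by default a bufio.Scanner refuses lines longer than bufio.MaxScanTokenSize (64 KiB) and stops with bufio.ErrTooLong. Here is a minimal sketch of how the line-by-line loop could be wrapped in a function that raises that limit with scanner.Buffer. The names countLines, countLinesInString and maxLineSize are made up for this example, the 1 MiB bound is an arbitrary assumption, and a string stands in for resp.Body so the sketch runs without a network call.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"strings"
)

// maxLineSize is an assumed upper bound on the length of a single line (1 MiB).
// Adjust it to whatever your data actually needs.
const maxLineSize = 1024 * 1024

// countLines reads r line by line and returns the number of lines seen.
// Only one line at a time is held in memory.
func countLines(r io.Reader) (int, error) {
	scanner := bufio.NewScanner(r)
	// Start with a 64 KiB buffer and allow it to grow up to maxLineSize.
	scanner.Buffer(make([]byte, 0, 64*1024), maxLineSize)
	count := 0
	for scanner.Scan() {
		// scanner.Text() would give the current line here.
		count++
	}
	// Scan also stops on errors (for example a line that is too long),
	// so the error has to be checked after the loop.
	return count, scanner.Err()
}

// countLinesInString is a convenience wrapper for trying the sketch locally.
// With HTTP you would call countLines(resp.Body) instead.
func countLinesInString(s string) (int, error) {
	return countLines(strings.NewReader(s))
}

func main() {
	n, err := countLinesInString("first\nsecond\nthird\n")
	if err != nil {
		log.Fatalf("Error while scanning : %v", err)
	}
	fmt.Printf("%d lines\n", n)
}
```

Since countLines only needs an io.Reader, the same function works unchanged on an HTTP response body, a file, or anything else that can be read as a stream.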