Sat 23 Apr

File reading performance in Python

There are several ways to read a file in Python; this page outlines some of them and compares their relative performance. I am working on a project that involves reading large amounts of data from text files, so I repeated the analysis on Python 2.6.6, the version currently shipping with Ubuntu 10.10. I ran three implementations (below) against a file with 1 million lines.

My test script is available here; the functions I tested follow the results. Here are my results:

Script       Time (sec)   Lines read per sec
fileread1    0.1695       5,899,280
fileread2    1.6387       610,236
fileread3    0.1278       7,823,156
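
The three functions under test:

import fileinput

def fileread1():
    # readlines() with no size hint loads the whole file into a list on the
    # first call; the second call returns an empty list and ends the loop.
    file = open("test.txt")
    while 1:
        lines = file.readlines()
        if not lines:
            break
    file.close()

def fileread2():
    # fileinput wraps the file in a line-by-line iterator.
    for l in fileinput.input("test.txt"):
        pass

def fileread3():
    # Iterate over the file object directly.
    file = open("test.txt")
    for l in file:
        pass
    file.close()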
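
Timings like the ones above can be reproduced with a small harness along the following lines. This is only a sketch and not the test script linked above: the helper names make_test_file, count_lines_iter and time_reader are made up for illustration, and it assumes that taking the best of a few wall-clock runs with time.time() is good enough for a rough comparison.

```python
import os
import tempfile
import time


def make_test_file(num_lines):
    """Write a temporary file with num_lines short lines; return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        for i in range(num_lines):
            f.write("line %d\n" % i)
    return path


def count_lines_iter(path):
    """Iterate over the file object directly (the fileread3 approach)."""
    count = 0
    f = open(path)
    try:
        for _ in f:
            count += 1
    finally:
        f.close()
    return count


def time_reader(reader, path, repeats=3):
    """Run reader(path) several times; return (best_seconds, lines_per_sec)."""
    best = None
    lines = 0
    for _ in range(repeats):
        start = time.time()
        lines = reader(path)
        elapsed = time.time() - start
        if best is None or elapsed < best:
            best = elapsed
    # Guard against a zero reading on timers with coarse resolution.
    rate = lines / best if best > 0 else float("inf")
    return best, rate


if __name__ == "__main__":
    path = make_test_file(1000000)
    try:
        seconds, rate = time_reader(count_lines_iter, path)
        print("%.4f sec, %.0f lines/sec" % (seconds, rate))
    finally:
        os.remove(path)
```

Any reader that takes a path and returns the number of lines it saw can be passed to time_reader, so the other two approaches can be compared the same way. Absolute numbers will depend on the machine, the Python version and whether the file is already in the OS cache.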
