Huge files in Python - over 4 Gbytes

Posted by admin (Graham Ellis), 19 August 2004
Looking back just a few years, a file in excess of 4 Gbytes was unthinkable: files (and even whole file systems) were limited to 2^32 (2 to the power 32) bytes.  These days, though, a file in excess of 4 Gb is perfectly possible on most (but not all) file systems, and can be handled by most (but not all) languages.
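As a quick sanity check on that figure, the 32-bit ceiling works out like this (a small Python sketch; "Gb" here follows the post's usage, meaning 2^30-byte gigabytes):

```python
# The largest byte count addressable with a 32-bit offset ...
limit = 2 ** 32
print(limit)            # 4294967296 bytes

# ... which is exactly the 4 Gbyte ceiling mentioned above.
print(limit / 2 ** 30)  # 4.0
```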

In the last couple of days, I was asked about huge files in Python - rumours of problems had been reported - so I wrote the following and tested it; it worked just fine, extracting every millionth line from a 6.9 Gb file.


huge = open("huge.txt")
count = 0

for line in huge.xreadlines():
    count += 1
    if not (count % 1000000):
        print str(count)+" "+line

Note - any construct that reads the whole of the file into memory in one go is going to fail ... that's why I chose xreadlines, which reads the file lazily, a line at a time.
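For readers on current versions of the language: xreadlines was removed in Python 3, but the same lazy, line-at-a-time behaviour comes from iterating over the file object directly. A minimal sketch of the same extraction (the helper name and the small demonstration file are my own, standing in for the 6.9 Gb original):

```python
import os
import tempfile

def every_nth_line(path, n):
    """Yield (line_number, line) for every nth line, reading lazily -
    the file object is itself an iterator, so memory use stays flat."""
    with open(path) as huge:
        for count, line in enumerate(huge, start=1):
            if count % n == 0:
                yield count, line

# Demonstration on a small temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    for i in range(1, 11):
        f.write("line %d\n" % i)

for count, line in every_nth_line(f.name, 3):
    print(count, line, end="")   # prints lines 3, 6 and 9

os.remove(f.name)
```

The `with` block also guarantees the file handle is closed, something the 2004 listing left to the interpreter.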

This page is a thread posted to the opentalk forum and archived here for reference. To jump to the archive index please follow this link.


© WELL HOUSE CONSULTANTS LTD., 2018: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01225 708225 • FAX: 01225 793803 • EMAIL: • WEB: • SKYPE: wellho