Huge files in Python - over 4 Gbytes

Posted by admin (Graham Ellis), 19 August 2004
Looking back just a few years, a file in excess of 4 Gbytes was unthinkable, and files (or even entire file systems) were limited to 2^32 (2 to the power 32) bytes.  These days, though, a file in excess of 4 Gb is perfectly possible on most (but not all) file systems and can be handled by most (but not all) languages.
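As a quick sanity check (a minimal sketch only - "huge.txt" is just a placeholder name), os.path.getsize will happily report sizes well beyond 4 Gbytes on a platform built with large-file support:

Code:
#!/usr/bin/python
# Sketch: report the size of a large file - "huge.txt" is a placeholder
import os

size = os.path.getsize("huge.txt")   # size in bytes (a long on 32-bit builds)
print str(size) + " bytes, about " + str(size / 2.0**30) + " Gbytes"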

In the last couple of days, I was asked about huge files in Python - rumours of problems had been reported - and I wrote the following, which ran just fine and extracted every millionth line from a 6.9 Gb file.

Code:
#!/usr/bin/python
# Print every millionth line of a very large file without
# reading the whole file into memory

huge = open("huge.txt")
count = 0

for line in huge.xreadlines():   # xreadlines reads lazily, one line at a time
    count += 1
    if not (count % 1000000):
        print str(count)+" "+line


Note - any construct that reads the whole of the file into memory at one go is going to fail ... that's why I chose xreadlines.
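If you're trying this on a later Python (xreadlines was removed in Python 3), iterating over the file object itself gives the same lazy, line-by-line read. A minimal sketch of the same job, again assuming the input is called huge.txt:

Code:
#!/usr/bin/env python3
# Python 3 equivalent - looping over the file object reads one line
# at a time, so memory use stays small however big the file is
count = 0
with open("huge.txt") as huge:
    for line in huge:
        count += 1
        if count % 1000000 == 0:
            print(count, line, end="")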



This page is a thread posted to the opentalk forum at www.opentalk.org.uk and archived here for reference.

