CSV module for Python

News

29 July 2003 - CSV module included in Python 2.3

The CSV module is included as a standard module in Python 2.3 and later. The module API was improved during the merge process.

20 November 2002 - csv-1.0 released

No problems were reported with the 1.0pre1 release. This release makes the pre-release module official.

8 November 2002 - csv-1.0pre1 released

The most important change for this release is that the basic quote handling has been modified to exactly match the observed behaviour of Excel. A comprehensive test suite has been added to test this observed behaviour. If the module does not conform to Excel behaviour we consider this to be a bug.

The pre designation on the release is due to the slight change in quote handling.

Detailed changes in this release are:

  1. Moved to standard BSD template license.
  2. Now using distutils to create source distribution.
  3. The parser's handling of unusual quoting styles was found to be at odds with Excel. In particular, Excel only treats the quote character as special if it appears as the first character in a field, whereas our parser honoured it anywhere. We now produce the same result as Excel: quotes within a field appear in the output.
  4. Introduced Parser.quote_char attribute to replace the hard coded " (double quote). You can now disable quoting by setting quote_char to 0 or None.
  5. Introduced Parser.escape_char attribute which is used to escape special characters when quoting is disabled.
  6. Introduced Parser.strict attribute which controls whether the parser will raise an exception on malformed fields rather than attempting to guess the right behaviour.
  7. Introduced a suite of unit tests.
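
Items 3 to 6 can be illustrated with the csv module that ships in the Python standard library (2.3 and later). This is only a sketch under a stated assumption: the standard module's quotechar, quoting, escapechar and strict parameters are used as stand-ins for the Parser.quote_char, Parser.escape_char and Parser.strict attributes listed above. The csv-1.0 Parser API itself is not shown, and parse_line is a hypothetical helper.

```python
import csv

def parse_line(line, **fmt):
    """Hypothetical helper: parse one CSV line with the stdlib reader."""
    return next(csv.reader([line], **fmt))

# A quote is only special at the start of a field; elsewhere it is data.
mid_quote = parse_line('ab"c,"d,e"')

# Quoting disabled: quote characters are ordinary data, and the escape
# character protects the delimiter instead.
unquoted = parse_line('"a",b\\,c', quoting=csv.QUOTE_NONE, escapechar='\\')

# Lenient mode guesses at malformed input; strict mode raises csv.Error.
lenient = parse_line('"a"b,c')
try:
    parse_line('"a"b,c', strict=True)
    strict_failed = False
except csv.Error:
    strict_failed = True
```

The malformed field `"a"b` has text after the closing quote. The lenient reader keeps going and joins the pieces, while the strict reader refuses the record.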

6 November 2001 - csv-0.5 released

This is a bugfix release.

The changes in this release are:

  1. Fixed bug in memory allocation of internal parser buffer - thanks to John Machin for pointing this out.
  2. Fixed compile warning on Solaris - thanks to Adam Goucher for reporting this.

12 July 2001 - csv-0.4 (John Machin release) released

This is a bugfix release. The changes in this release are:

  1. Exception raising was leaking the error message. Thanks to John Machin for fixing this.
  2. When a parsing exception is raised during parse(), the parser will automatically call clear() to discard accumulated fields and state the next time you call parse().

    The old behaviour can be restored either by passing zero as the auto_clear constructor keyword argument, or by setting the auto_clear parser attribute to zero.

    As well as raising an exception, a parsing error will also set the readonly parser attribute had_parse_error to 1. This is reset next time you call parse() or clear().

    Thanks again to John Machin for suggesting this.

  3. An obscure parsing bug has been fixed.

    The old behaviour:

    >>> p.parse('12,12,1",')
    ['12', '12', '1",']
    >>>

    The new behaviour:

    >>> p.parse('12,12,1",')
    ['12', '12', '1"', '']
    >>>

    I am still of two minds about whether I should raise an exception when I encounter text like that...
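
For comparison, the csv module in the Python standard library gives the same result as the new behaviour for this input. The sketch below also shows one way to carry on after a parsing error, loosely in the spirit of auto_clear. It relies on the standard reader starting each record from a clean state, which is how CPython behaves. It does not use the csv-0.4 Parser, and parse_all is a hypothetical helper.

```python
import csv

# The input from item 3: a quote in the middle of an unquoted field.
record = next(csv.reader(['12,12,1",']))

def parse_all(lines):
    """Collect good records and note malformed lines, continuing after errors."""
    reader = csv.reader(lines, strict=True)
    records, bad_lines = [], []
    while True:
        try:
            records.append(next(reader))
        except csv.Error:
            # line_num counts the lines read from the input so far.
            bad_lines.append(reader.line_num)
        except StopIteration:
            break
    return records, bad_lines

records, bad_lines = parse_all(['a,b', '"c"d,e', 'f,g'])
```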

17 June 2001 - csv-0.3 released

Module updated to use distutils - should now be easy to build for other platforms.

28 January 2001 - csv-0.2 released

With all of this CSV parsing discussion, I received a couple of requests for enhancements to my fast CSV module. The enhancements are:

  • Parser object now has the field_sep attribute. This is the character which is used to delimit fields in records.

    I found (and reported) a bug in the Python PyMember_Set() function while adding this feature.

  • Parser object now has the join() method which combines the elements in a sequence and returns a CSV record in a string.
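
Both enhancements have rough counterparts in the csv module of the Python standard library, sketched here as an illustration only. The delimiter parameter is assumed to play the role of field_sep. The join function below is a hypothetical stand-in built on csv.writer, not the Parser.join() method itself.

```python
import csv
import io

def join(fields, delimiter=','):
    """Combine a sequence of fields into one CSV record string."""
    buf = io.StringIO()
    # An empty lineterminator returns the record without a trailing newline.
    csv.writer(buf, delimiter=delimiter, lineterminator='').writerow(fields)
    return buf.getvalue()

# Fields containing the delimiter or a quote are quoted on output.
line = join(['a', 'b;c', 'd "e"'], delimiter=';')

# Reading the record back with the same delimiter recovers the fields.
fields = next(csv.reader([line], delimiter=';'))
```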