csv2xml - The csv file format
There is no standard csv file format. Different software applications adopted the format independently, and they read and write csv files differently. Sometimes this is due to bugs in the software, and sometimes it is simply to meet the needs of a particular application. Thankfully, the majority of applications using csv follow the same conventions, and it is these conventions that I describe here.
Back to homepage.
Basic structure
Here are the basic rules for formatting a csv file, followed by some examples.
- Each line in a csv file simply consists of text fields, separated by commas.
- Each line is ended by a line break: a CR, an LF, or a CRLF, depending on the application that wrote the file.
- If a field in your file contains a comma or a line break (CR or LF), you need to escape these special characters. You do this by starting the text field with a quote (") and ending it with another quote.
- Because the quote is the escape character, you also need to be able to escape a quote itself. This is done by placing a second quote in front of the quote ("").
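The rules above can be sketched as a small writer. This is only an illustration in Python, not code taken from csv2xml; the names quote_field and format_row are made up for the example, and ending each line with CRLF is an assumption.

```python
def quote_field(field: str) -> str:
    """Quote a field only if it contains a comma, a quote, or a line break."""
    if any(ch in field for ch in (",", '"', "\r", "\n")):
        # Double every embedded quote, then wrap the whole field in quotes.
        return '"' + field.replace('"', '""') + '"'
    return field


def format_row(fields) -> str:
    """Join the escaped fields with commas and end the line with CRLF (assumed)."""
    return ",".join(quote_field(f) for f in fields) + "\r\n"


# Prints one csv line: a,"a,c,e","fred""fish"
print(format_row(["a", "a,c,e", 'fred"fish']), end="")
```

Fields that need no escaping are written unchanged, so simple files stay readable in a text editor.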
Examples
I believe the best way to learn is by example, so here are some examples of valid csv files.
Example 1 -- Fairly average example
header1,header2,header2
my,cat,hat
fish,dog,cat
happy,fred,123
Example 2 -- Demonstrates escaping a comma
i went, a cat, fishing is bad, 12
a, b, c, d
another, "example here", a, b
"string with comma, is here", a, b,c
Example 3 -- Demonstrates escaping a carriage return
more complicated example, second field
"escape a CR
in this", second field
first field, second field
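To see how a reader treats a quoted line break, here is a rough sketch using the csv module from Python's standard library. The sample data is hypothetical, not taken from a real file.

```python
import csv
import io

# Hypothetical sample: the second record has a line break inside a quoted field.
data = 'id,note\r\n1,"first line\r\nsecond line"\r\n2,plain\r\n'

# newline="" stops Python from translating line endings before csv sees them.
rows = list(csv.reader(io.StringIO(data, newline="")))

for row in rows:
    print(row)
```

The reader returns three records even though the text spans four lines, because the line break inside the quotes is kept as part of the field.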
Example 4 -- How to escape a " character
a, c, e
a, "a,c,e", e
a, "fred""fish", e
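As a quick check, the three records of Example 4 can be fed to Python's standard csv module. This is a sketch only. It assumes each record ends with CRLF, and it passes skipinitialspace=True because the example puts a space after each comma.

```python
import csv
import io

# The three records from Example 4, each ended by CRLF (an assumption).
data = 'a, c, e\r\na, "a,c,e", e\r\na, "fred""fish", e\r\n'

# skipinitialspace=True ignores the space after each comma, so the quote
# counts as the first character of its field.
reader = csv.reader(io.StringIO(data, newline=""), skipinitialspace=True)
rows = list(reader)

for row in rows:
    print(row)
```

The doubled quote in the last record comes back as a single " inside the field. Without skipinitialspace, this reader would not see the quote as the start of the field and would most likely split "a,c,e" at its commas.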
Errata
- Some applications remove leading and trailing whitespace from each text field.
- Some applications use ; to separate fields, which is an odd thing to use in what is supposed to be a comma separated file!
- Many bad applications forget to escape the CR character, leading to corrupt files.
- If a quote character is not the first character in a text field, do you still need to escape it? Some applications do, some don't. To be safe, always quote text fields that have quotes inside them.
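A reader can be made tolerant of some of these quirks. The sketch below uses Python's standard csv module on a made-up file that uses ; as the separator and pads its fields with spaces. It is one possible approach, not a universal fix.

```python
import csv
import io

# Hypothetical file from an application that separates fields with ';'
# and pads unquoted fields with spaces.
data = 'name ; city\r\n fred ;"Paris; France"\r\n'

# Tell the reader which separator this particular file uses.
reader = csv.reader(io.StringIO(data, newline=""), delimiter=";")

# Strip leading and trailing whitespace from every field after parsing.
rows = [[field.strip() for field in row] for row in reader]

for row in rows:
    print(row)
```

Note that stripping after parsing also removes whitespace at the edges of quoted fields, which may not be what every application wants.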