Your colleagues' suggestions are very wrong; they would be incredibly slow if you are reading a lot of data. I use a mix of a complex regular expression and quote counting to read CSV files. The quote counting is needed because one cell can span multiple lines, and since quotes inside a quoted cell are doubled, a complete row always contains an even number of double quotes. For each row of CSV I keep reading lines of text until I have a string with an even number of double quotes.
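The quote-counting step could look roughly like the Python sketch below. The function name `read_records` and the sample data are made up for illustration, and the sketch assumes well-formed input in which every quote inside a quoted cell is doubled. Note that it rejoins the physical lines with "\n", so the original line endings are normalized.

```python
import io

def read_records(lines):
    """Yield one logical CSV record at a time from an iterable of text lines.

    Hypothetical helper: keeps appending physical lines until the running
    count of double quotes is even, which means no quoted cell is still open.
    """
    pending = []
    quotes = 0
    for line in lines:
        line = line.rstrip("\r\n")
        pending.append(line)
        quotes += line.count('"')
        if quotes % 2 == 0:
            yield "\n".join(pending)
            pending, quotes = [], 0
    if pending:
        # Input ended inside an open quoted cell: hand back what is left.
        yield "\n".join(pending)

# Made-up sample: the first record spans two physical lines.
sample = io.StringIO('a,"two\nlines",c\nd,e,f\n')
records = list(read_records(sample))
```

Here `records` holds two strings, the first of which still contains the embedded line break; splitting each record into fields is a separate step.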
I can't remember where I got it from now, but this is the expression:
"(?:^|,)(\\\"(?:[^\\\"]+|\\\"\\\")*\\\"|[^,]*)"
Each match gives you one field, from which you may also need to remove the outer quotes and then replace each doubled double quote ("") with a single double quote (").
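As a rough illustration, here is the same idea in Python. It assumes the expression above is an escaped string literal whose effective pattern is (?:^|,)("(?:[^"]+|"")*"|[^,]*); the names `FIELD` and `split_fields` are invented for this sketch, and it expects one complete, well-formed record as input.

```python
import re

# Assumed unescaped form of the expression quoted above.
FIELD = re.compile(r'(?:^|,)("(?:[^"]+|"")*"|[^,]*)')

def split_fields(record):
    """Split one complete CSV record into a list of field values."""
    fields = []
    for m in FIELD.finditer(record):
        value = m.group(1)
        # Remove the outer quotes, then collapse doubled quotes.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1].replace('""', '"')
        fields.append(value)
    return fields

# Made-up record: a comma inside quotes, escaped quotes, and an empty last field.
row = split_fields('1,"Smith, John","said ""hi""",')
```

The nested repetition in the quoted branch can backtrack heavily on malformed input such as an unclosed quote, so this is only a sketch for records that have already passed the quote-counting check.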
Comments 3
As regards that RegEx, would it deal with this?
This,is,a,"perfectly
valid bit of ""CSV""",representing,"a,
single,record,",of,nine,fields
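One way to check this sample is to run it through both the regex approach and Python's standard csv module and compare the results. The sketch below assumes the effective pattern is (?:^|,)("(?:[^"]+|"")*"|[^,]*) and that the three lines have already been joined into one record by quote counting; `unquote` is a hypothetical helper.

```python
import csv
import io
import re

# The three physical lines from the comment, joined into one record.
sample = (
    'This,is,a,"perfectly\n'
    'valid bit of ""CSV""",representing,"a,\n'
    'single,record,",of,nine,fields'
)

# Assumed unescaped form of the expression from the answer.
FIELD = re.compile(r'(?:^|,)("(?:[^"]+|"")*"|[^,]*)')

def unquote(value):
    """Strip outer quotes and collapse doubled quotes (hypothetical helper)."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1].replace('""', '"')
    return value

# Double quotes per physical line; the running total is odd until line three.
quote_counts = [line.count('"') for line in sample.split("\n")]

regex_fields = [unquote(m.group(1)) for m in FIELD.finditer(sample)]

# Reference parse with the standard library for comparison.
csv_rows = list(csv.reader(io.StringIO(sample, newline="")))
```

Under those assumptions, `regex_fields` and `csv_rows[0]` both contain the same nine fields, with the embedded line breaks and commas preserved inside the fourth and sixth fields.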