When a file contains both Unicode text and values that include a comma (such as amounts with a thousand separator), choosing the correct IO class is important.
Below are some of the classes and their hierarchy.
Class name          Class declaration
==================  =================================
Io                  class Io extends Object
+- CommaIo          class CommaIo extends Io
|  +- TextIo        class TextIo extends CommaIo
|  +- Comma7Io      class Comma7Io extends CommaIo
|  +- AsciiIo       class AsciiIo extends CommaIo
+- CommaTextIo      class CommaTextIo extends Io
Given a CSV file with the following line:
,,미지정,외환은행,123-123456-123,외환체크사용,2012/10/05,KRW,,"2,350","11,563,531"
It contains both Unicode text (e.g. 외환은행) and values that themselves contain a comma (e.g. 2,350).
The 2,350 needs to be read as a single value, 2350, instead of as two separate fields, 2 and 350.
The screenshots below show how each class reads the line.
CommaTextIo correctly reads both the Unicode text and the amount with a thousand separator.
CommaIo correctly reads the amount with a thousand separator but not the Unicode text.
TextIo correctly reads the Unicode text but not the amount with a thousand separator (the amount is supposed to be 2350, not 2 and 350).
AsciiIo incorrectly reads both the Unicode text and the amount with a thousand separator.
Below is the sample code used to read the CSV file. The screenshots above were captured by running this job once for each of the classes mentioned (CommaTextIo, CommaIo, TextIo, and AsciiIo).
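static void TestReadCSV(Args _args)
{
    #File
    #define.comma(',')
    CommaTextIo commaTextIo;
    container   lineCon;
    ;
    // Swap CommaTextIo for CommaIo, TextIo or AsciiIo to compare the results.
    commaTextIo = new CommaTextIo(@'C:\UnicodeSample.csv', #io_read);
    if (!commaTextIo)
    {
        // new returns null when the file cannot be opened.
        throw error('Unable to open the CSV file.');
    }
    commaTextIo.inFieldDelimiter(#comma);
    commaTextIo.inRecordDelimiter(#delimiterCRLF);

    // Read line by line until the file is exhausted or an error occurs.
    lineCon = commaTextIo.read();
    while (lineCon && (commaTextIo.status() == IO_Status::Ok))
    {
        info(conPeek(lineCon, 4)); // 4th field: the bank name (외환은행)
        lineCon = commaTextIo.read();
    }
}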
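
The job above only prints the bank name. As a rough sketch of how the amount column might be handled, the variation below reads the 10th field of the sample line and strips the thousand separator before converting it to an integer. It assumes that CommaTextIo hands the quoted amount back as the string 2,350 (this is not confirmed by the post), and the job name TestReadCSVAmount is made up for illustration.

```xpp
static void TestReadCSVAmount(Args _args)
{
    #File
    CommaTextIo commaTextIo;
    container   lineCon;
    str         rawAmount;
    int         amount;
    ;
    commaTextIo = new CommaTextIo(@'C:\UnicodeSample.csv', #io_read);
    if (!commaTextIo)
    {
        throw error('Unable to open the CSV file.');
    }
    commaTextIo.inFieldDelimiter(',');
    commaTextIo.inRecordDelimiter(#delimiterCRLF);

    lineCon = commaTextIo.read();
    while (lineCon && (commaTextIo.status() == IO_Status::Ok))
    {
        // Assumption: field 10 arrives as a string such as '2,350'.
        rawAmount = conPeek(lineCon, 10);

        // strRem removes every comma; str2int converts the remaining digits.
        amount = str2int(strRem(rawAmount, ','));

        info(strFmt('%1 -> %2', rawAmount, amount));
        lineCon = commaTextIo.read();
    }
}
```

If the class returns the field as a number rather than a string, the strRem step is unnecessary and the value can be used directly.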