csv
Write Spark dataframe as CSV with partitions
Spark 2.0.0+: The built-in csv format supports partitioning out of the box, so you should be able to simply use: df.write.partitionBy('partition_date').mode(mode).format("csv").save(path) without including any additional packages. Spark < 2.0.0: At this moment (v1.4.0) spark-csv doesn't support partitionBy (see databricks/spark-csv#123), but you can adjust built-in sources to achieve what you want. You can try two different approaches. …
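To make the result of partitionBy more concrete, here is a minimal sketch of the Hive-style directory layout it produces (one `column=value` directory per distinct value, with the partition column left out of the data files). This is not Spark code: it emulates the layout with the Python standard library only, and the function name `write_partitioned_csv`, the file name `part-00000.csv`, and the header row are assumptions made for illustration.

```python
import csv
from collections import defaultdict
from pathlib import Path


def write_partitioned_csv(rows, fieldnames, partition_col, out_dir):
    """Write dict rows into <out_dir>/<partition_col>=<value>/part-00000.csv."""
    # Group the rows by the value of the partition column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_col]].append(row)

    # The partition value is encoded in the directory name,
    # so it is dropped from the columns written to each file.
    data_cols = [c for c in fieldnames if c != partition_col]

    written = []
    for value, group in sorted(groups.items()):
        part_dir = Path(out_dir) / f"{partition_col}={value}"
        part_dir.mkdir(parents=True, exist_ok=True)
        target = part_dir / "part-00000.csv"  # hypothetical file name
        with open(target, "w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=data_cols, extrasaction="ignore")
            writer.writeheader()  # assumption: a header row is wanted
            writer.writerows(group)
        written.append(target)
    return written
```

Calling it with rows that carry two distinct partition_date values yields two directories, each holding one CSV file without the partition_date column.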
CSV with comma or semicolon?
On Windows, it depends on the List separator setting found in the “Regional and Language Options” customize screen. This is the character Windows applications expect to be the CSV separator. Of course, this only has an effect in Windows applications; for example, Excel will not automatically split data into columns if the file is not …
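When a program has to read files produced under either convention, one option is to detect the separator instead of assuming it. The sketch below uses `csv.Sniffer` from the Python standard library, restricted to comma and semicolon; the helper name `read_with_detected_delimiter` and the sample data are made up for illustration, and the sniffer is a heuristic that can fail on very short or irregular samples.

```python
import csv
import io


def read_with_detected_delimiter(text, candidates=",;"):
    """Guess the delimiter among `candidates`, then parse the text with it."""
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, rows


# A semicolon-separated sample that uses decimal commas inside the fields.
sample = "name;price\napple;1,50\npear;2,25\n"
delimiter, rows = read_with_detected_delimiter(sample)
print(delimiter, rows)
```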
Can a CSV file have a comment?
No. The CSV “standard” (such as it is) does not dictate how comments should be handled; it’s up to the application to establish a convention and stick with it.
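As an example of one such convention, the sketch below treats any line whose first non-blank character is `#` as a comment and drops it before parsing. The `#` prefix and the helper name `read_csv_skipping_comments` are assumptions, not part of any CSV specification. Because the filter works line by line, it would also drop a continuation line of a quoted multi-line field that happens to start with `#`.

```python
import csv
import io


def read_csv_skipping_comments(lines, comment_prefix="#"):
    """Parse CSV lines, ignoring lines that start with `comment_prefix`."""
    data_lines = (
        line for line in lines
        if not line.lstrip().startswith(comment_prefix)
    )
    return list(csv.reader(data_lines))


text = "# exported by a hypothetical tool\nid,name\n1,Ann\n# a note\n2,Bob\n"
rows = read_csv_skipping_comments(io.StringIO(text))
print(rows)
```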
CodedUI test does not read data from CSV input file
Some text files start with a Byte Order Mark (BOM). The CSV reader within Coded UI does not handle the BOM and treats it as part of the first field name. The screenshot below shows the debug trace of a CSV file with a BOM and that same file shown in Notepad++. The DataRow.ItemArray[…] …
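The same symptom is easy to reproduce outside Coded UI. The sketch below is Python, not Coded UI code: it decodes a small UTF-8 sample that starts with a BOM in two ways, once naively and once with the `utf-8-sig` codec, which strips a leading BOM if one is present. The sample bytes are made up for illustration.

```python
import csv
import io

# UTF-8 BOM (EF BB BF) followed by a tiny CSV document.
raw = b"\xef\xbb\xbfname,city\nAnn,Oslo\n"

# Plain "utf-8" keeps the BOM as U+FEFF at the start of the first field name.
naive_header = next(csv.reader(io.StringIO(raw.decode("utf-8"))))

# "utf-8-sig" removes a leading BOM before the text reaches the CSV parser.
clean_header = next(csv.reader(io.StringIO(raw.decode("utf-8-sig"))))

print(repr(naive_header[0]), repr(clean_header[0]))
```

A lookup by the column name "name" fails against the first header and succeeds against the second.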
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 7240: character maps to <undefined>
I have solved this issue. We can use this code:
    import codecs
    types_of_encoding = ["utf8", "cp1252"]
    for encoding_type in types_of_encoding:
        with codecs.open(filename, encoding=encoding_type, errors="replace") as csvfile:
            # your code …
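With errors="replace", decoding never raises, so undecodable bytes are silently replaced and the loop body runs once per encoding. A stricter variant, sketched below, tries each candidate encoding in order and returns the first one that decodes without error. The helper name `decode_with_fallback` and the candidate list are assumptions; `latin-1` is placed last because it accepts every byte value and therefore always succeeds, whether or not it is the right encoding.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding_name) for the first encoding that decodes `raw`."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("none of the candidate encodings could decode the data")


# 0xE9 alone is invalid UTF-8 but is "é" in cp1252.
text, used = decode_with_fallback(b"caf\xe9")
print(text, used)
```

Byte 0x8d from the error message is undefined in Python's cp1252 codec as well, so data containing it falls through to latin-1 in this sketch.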
Export as csv in beeline hive
When the Hive version is at least 0.11.0, you can execute: INSERT OVERWRITE LOCAL DIRECTORY '/tmp/directoryWhereToStoreData' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY "\n" SELECT * FROM yourTable; from hive/beeline to store the table into a directory on the local filesystem. Alternatively, with beeline, save your SELECT query in yourSQLFile.sql and run: beeline …
Sort CSV file by multiple columns using the “sort” command
You need to use two options for the sort command: --field-separator (or -t) and --key=<start,end> (or -k), to specify the sort key, i.e. which range of columns (start through end index) to sort by. Since you want to sort on 3 columns, you’ll need to specify -k 3 times, for columns 2,2, 1,1, and 3,3. To …
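For comparison, the same ordering (column 2 first, then column 1, then column 3) can be sketched in Python by sorting on a tuple of fields. This is an equivalent written with the standard library, not the sort command itself: the helper name `sort_csv_rows` and the sample data are made up, columns are 1-based to match -k, and strings are compared by code point, which may differ from sort under a non-C locale.

```python
import csv
import io


def sort_csv_rows(text, key_columns=(2, 1, 3), delimiter=","):
    """Sort CSV rows by the given 1-based columns, in priority order."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    # Build one tuple per row; tuples compare element by element,
    # so the first listed column has the highest priority.
    return sorted(rows, key=lambda row: tuple(row[i - 1] for i in key_columns))


sample = "b,x,2\na,y,1\na,x,3\nb,x,1\n"
sorted_rows = sort_csv_rows(sample)
print(sorted_rows)
```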
How to check encoding of a CSV file
You can use Notepad++ to evaluate a file’s encoding without needing to write code. The evaluated encoding of the open file will display on the bottom bar, far right side. The encodings supported can be seen by going to Settings -> Preferences -> New Document/Default Directory and looking in the drop down.
VBScript to loop through all files in a folder
Maybe this will clear things up. (Or confuse you more.)
    Const ForReading = 1
    Const ForWriting = 2
    sFolder = "H:\Letter Display\Letters\"
    Set oFSO = CreateObject("Scripting.FileSystemObject")
    For Each oFile In oFSO.GetFolder(sFolder).Files
        If UCase(oFSO.GetExtensionName(oFile.Name)) = "LTR" Then
            ProcessFiles oFSO, oFile
        End If
    Next
    Set oFSO = Nothing
    Sub ProcessFiles(FSO, File)
        Set oFile2 = FSO.OpenTextFile(File.path, ForReading)
        …