csvkit
command line tool to manipulate excel and csv files
Resources
link to tutorial page and documentation really awesome, read this first
Installation for gitbash
pip install csvkitThis run without problems but installed in a different path
So I just copied the files into /usr/bin/
The scripts got installed here:
C:\Users\vld\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\ScriptsTo get to the hidden AppData folder pres WindowButton + R and then type %AppData%
Adjust csv data and insert into a text document
Insert a table
Insert a table from "stacked.csv" into document in vim:
.r ! csvlook stacked.csv
data1-test.csv
1
2
20
data1-test.csv
2
13
34
data2-test.csv
1
3
205
data2-test.csv
2
13
34
data2-test.csv
3
13
34
data2-test.csv
4
13
34
data3-test.csv
2
13
34
data3-test.csv
3
13
34
data3-test.csv
4
13
34
Insert only selected columns of table
To insert only "group" and "peakNo" columns.
.r ! csvcut -c group,peakNo | csvlook
data1-test.csv
1
data1-test.csv
2
data2-test.csv
1
data2-test.csv
2
data2-test.csv
3
data2-test.csv
4
data3-test.csv
2
data3-test.csv
3
data3-test.csv
4
Insert table sorted by one or multiple columns
To insert a table sorted by two columns (max and peakNo):
.r ! csvsort -c max,peakNo stacked.csv | csvlook
data1-test.csv
1
2
20
data1-test.csv
2
13
34
data2-test.csv
2
13
34
data3-test.csv
2
13
34
data2-test.csv
3
13
34
data3-test.csv
3
13
34
data2-test.csv
4
13
34
data3-test.csv
4
13
34
data2-test.csv
1
3
205
-r flag: revert the sorting to descending
Tutorial
Getting example data
in2csv the excel killer
transform to csv and show in terminal
in2csv ne_1033_data.xlsx
transform to csv and save as data.csv
in2csv ne_1033_data.xlsx > data.csv
csvlook: dataperiscope
to get the rough idea of the data use csvlook
csvlook data.csv
to get it paged, separated in columns and able to navigate through, use less
csvlook data.csv | less -S
csvcut: datascalpel
to select and reorder columns
to display the numbers and names of the columns use -n flag*
csvcut -n data.csv
Insert table filtered by word in given column
To select only rows containing "data2" use -m flag:
.r ! csvgrep -c group -m data2 stacked.csv | csvlook
data2-test.csv
1
3
205
data2-test.csv
2
13
34
data2-test.csv
3
13
34
data2-test.csv
4
13
34
-i flag: to revert from select matching to return not matching
Insert table filtered by regular expression in given column
To filter rows which in group column contain "ta2" or "ta3" use -r flag:
.r ! csvgrep -c group -r "ta2|ta3" stacked.csv | csvlook
data2-test.csv
1
3
205
data2-test.csv
2
13
34
data2-test.csv
3
13
34
data2-test.csv
4
13
34
data3-test.csv
2
13
34
data3-test.csv
3
13
34
data3-test.csv
4
13
34
-i flag: to revert from select matching to return not matching
Filter table rows by matching list contained in a file
Cell in csv is matched exactly by line
To filter by file "filter-file" which contains 2 lines: data3-test.csv and data1-test.csv.\
Print the lines containing what is in the file\
.r ! csvgrep -c group -f filter-file stacked.csv | csvlook
data1-test.csv
1
2
20
data1-test.csv
2
13
34
data3-test.csv
2
13
34
data3-test.csv
3
13
34
data3-test.csv
4
13
34
print lines not containing what is in the file
.r ! csvgrep -c group -if filter-file stacked.csv | csvlook
data2-test.csv
1
3
205
data2-test.csv
2
13
34
data2-test.csv
3
13
34
data2-test.csv
4
13
34
Cell in csv contains string found in file
To filter by file which contains two lines: data3 and data1
.r ! cat stacked.csv | grep -Fvwf filter-file | csvlook
grep flags: -f: take from file -F: as fixed strings -W: only lines with whole words matches -v: select non matching lines (filtering)\
data2-test.csv
1
3
205
data2-test.csv
2
13
34
data2-test.csv
3
13
34
data2-test.csv
4
13
34
Stack first lines of csvs together - csvstack approach
stack the csvs --filename flag creates a column with filenames
csvstack --filenames data*t.csv > stacked.csvselect the (number of peaks) you want to keep
csvgrep -c peakNo -r "^[1-2]$" stacked.csv > filtered.csv
Or this is an alternative
csvgrep -c peakNo -r "^1|2$" stacked.csv > filtered.csv
Look at the data
csvlook filtered.csvDo a summary statistics
csvstat filtered.csv
Getting tables into markdown files
Installation for gitbash
pip --versioncheck if the pip is installedpip install csvkittry to installin case the output contains "Warning ... is not in the path", add the folder to the path following way:
PATH_BACKUP=$PATHcreate a copy of path just in casethe path is formatted in windows style
C:\something\somethingand needs to be reformatted into linux stylecopy the path of the folder indicated in the folder in text file
first replace the '\' with '/'
second change 'C:' into '/c/'
in the terminal add the changed string into the PATH using
PATH=$PATH:<your string>echo $PATHto see if it is there
run
csvkit --versionto see if it is installed
Last updated