Getting started with Danfo.js
If you process a lot of data, you need tools that simplify the work. One such tool is Danfo.js. According to its website: "Danfo.js is an open-source, JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data". If you come from a Python background, you are probably familiar with Pandas. Danfo.js plays a similar role in JavaScript.
This notebook gives a short introduction to using Danfo.js. Let us start by importing the required dependencies.
Creating DataFrames
The basic tool for working with tabular data in Danfo.js is the DataFrame. A DataFrame is a 2-dimensional data structure whose columns can each store data of a different type. Every column and every row has a label. In many applications the row labels are simply integer indices.
There are many ways to define a DataFrame in Danfo.js. One way is to pass an array of JSON objects, where each object describes one row and maps column labels to values. Read the Danfo.js Getting Started guide for other ways to create DataFrames.
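To make the input shape concrete, here is a minimal plain-JavaScript sketch of an array of JSON objects of the kind described above. It deliberately uses no Danfo.js calls, and the variable names and values are made up for illustration. Each object is one row, its keys act as the column labels, and the array positions act as integer row indices.

```javascript
// Hypothetical sample data: each object is one row, keyed by column label.
const records = [
  { A: 1.5, B: "apple", C: true, D: 10 },
  { A: 2.5, B: "banana", C: false, D: 20 },
  { A: 3.5, B: "cherry", C: true, D: 30 },
];

// Column labels are the keys of the row objects.
const columnLabels = Object.keys(records[0]);

// Row labels default to the integer positions in the array.
const rowIndex = records.map((_, i) => i);

// Each column holds values of a single JavaScript type in this sample.
const columnTypes = {};
for (const label of columnLabels) {
  columnTypes[label] = typeof records[0][label];
}

console.log(columnLabels); // [ 'A', 'B', 'C', 'D' ]
console.log(rowIndex);     // [ 0, 1, 2 ]
console.log(columnTypes);  // { A: 'number', B: 'string', C: 'boolean', D: 'number' }
```

In Danfo.js, an array shaped like records is what you would hand to the DataFrame constructor. The type names the library reports for each column may differ from the typeof results shown here.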
You can display the DataFrame by using the toString() method. The labels "A, B, C, D" of the columns are displayed at the top and the indices of the rows on the left side. The data type of each column can be displayed with the ctypes property.
Creating Series
Each column in a DataFrame is a Series. A Series can be created from an array, and its integer indices are generated automatically.
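The automatic indexing can be pictured with a few lines of plain JavaScript. This is only a conceptual sketch, not the Danfo.js implementation; the names values and series are hypothetical.

```javascript
// Hypothetical values for a single column.
const values = [10, 20, 30, 40];

// Pair each value with an automatically generated integer index,
// mirroring the default labelling described above.
const series = values.map((value, index) => ({ index, value }));

// Print one "index  value" line per entry.
for (const { index, value } of series) {
  console.log(`${index}  ${value}`);
}
```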
Display the Series with the toString() method.
You can also create Series from TensorFlow tensors.
Viewing data
If you are working with large datasets, displaying only specific parts of the data becomes important. The label of a column can be used to obtain that column as a Series.
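Selecting a column by its label amounts to pulling one field out of every row while keeping the row indices. The sketch below shows that idea in plain JavaScript under made-up data; selectColumn is a hypothetical helper, not a Danfo.js function.

```javascript
// Hypothetical row-oriented data.
const rows = [
  { name: "Ada", age: 36 },
  { name: "Grace", age: 41 },
  { name: "Linus", age: 29 },
];

// Collect a single column from the rows, keeping each row's
// integer position as its index.
function selectColumn(table, label) {
  return table.map((row, index) => ({ index, value: row[label] }));
}

const ages = selectColumn(rows, "age");
console.log(ages.map((entry) => entry.value)); // [ 36, 41, 29 ]
```

With a Danfo.js DataFrame the lookup itself is done by the library, typically by indexing the DataFrame with the column label.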
The head and tail functions display the top or bottom rows of the DataFrame.
The index and columns properties can be used to show the labels of the rows and columns.
The describe method can be used to display common statistics about the DataFrame.
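What these viewing helpers compute can be sketched in plain JavaScript for a single numeric column. The functions below are hypothetical stand-ins written for this illustration, not the Danfo.js implementations. The default of 5 rows and the use of the sample standard deviation (dividing by count - 1) are assumptions of the sketch, and the exact set of statistics the library reports may differ.

```javascript
// Hypothetical numeric column.
const data = [4, 8, 15, 16, 23, 42];

// First n entries of the column.
const head = (arr, n = 5) => arr.slice(0, n);

// Last n entries; Math.max guards against n exceeding the length.
const tail = (arr, n = 5) => arr.slice(Math.max(arr.length - n, 0));

// Common summary statistics for a non-empty numeric array.
function describe(arr) {
  const count = arr.length;
  const mean = arr.reduce((sum, x) => sum + x, 0) / count;
  const sorted = [...arr].sort((a, b) => a - b);
  const mid = Math.floor(count / 2);
  const median =
    count % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
  // Sample variance (divides by count - 1); an assumption for this sketch.
  const variance =
    arr.reduce((sum, x) => sum + (x - mean) ** 2, 0) / (count - 1);
  return {
    count,
    mean,
    std: Math.sqrt(variance),
    min: sorted[0],
    median,
    max: sorted[count - 1],
  };
}

console.log(head(data, 2)); // [ 4, 8 ]
console.log(tail(data, 2)); // [ 23, 42 ]
console.log(describe(data));
```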