Post Processing Data
The following sections describe how historical data, obtained as an xarray.DataArray as described here, can be post-processed.
An example of post-processing is available at examples/coin_history_post_process.py
The currently offered post-processing capabilities are:
Type Conversion
The data obtained from the exchange is not always set with the correct type. For example, Binance provides the open (the opening value of the ticker) as a string. The type converter stores the same value as a float.
The types are stored in the OHLCVFields.
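As a rough illustration of the conversion described above, the sketch below casts string-valued opens to floats using plain xarray. It does not call TypeConvertedData itself; the array contents, the `timestamp` dimension name and the variable names are assumptions made for the example.

```python
import numpy as np
import xarray as xr

# Hypothetical raw data: the exchange delivered the "open" values as strings.
raw_open = xr.DataArray(
    np.array(["7195.24", "7200.85", "7202.55"]),
    dims=["timestamp"],
    coords={"timestamp": [0, 1, 2]},
    name="open",
)

# Cast the string values to 64-bit floats; coordinates and name are preserved.
converted_open = raw_open.astype(np.float64)

print(raw_open.dtype.kind)   # "U" (unicode string)
print(converted_open.dtype)  # float64
```

Once the values are floats, arithmetic such as `converted_open.mean()` behaves as expected, which is not the case for the string-typed original.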
- class crypto_history.data_container.data_container_post.TypeConvertedData
  Type converts the data in the dataarray/dataset
- get_ohlcv_field_type_dict() → Dict[KT, VT]
  Gets the field types of the OHLCV Fields converted to the numpy/pandas format. This is done to be able to handle nan values in Int/String types.
  Returns (dict): Dictionary of the map from the ohlcv-field to the pd/np type
- set_type_on_dataarray(dataarray: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray
  Sets the type on the xr.DataArray according to the ohlcv field type
  Parameters: dataarray (xr.DataArray) – The DataArray on which the type has to be set
  Returns: xr.DataArray which has the type set on it
- set_type_on_dataset(dataset: xarray.core.dataset.Dataset) → xarray.core.dataset.Dataset
  Sets the type on the xr.Dataset according to the ohlcv field type
  Parameters: dataset (xr.Dataset) – The dataset on which the type has to be set
  Returns: xr.Dataset which has the type set on it
- type_mapping = {<class 'int'>: <class 'pandas.core.arrays.integer.Int64Dtype'>, <class 'str'>: <class 'pandas.core.arrays.string_.StringDtype'>, <class 'float'>: <class 'numpy.float64'>}
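The mapping above points Python types at pandas/numpy types that can represent missing values. The sketch below shows why that matters for integers. It rebuilds a similar mapping by hand, using dtype instances rather than the classes shown above (an assumption made so the example runs across pandas versions), and the sample values are invented.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the mapping above (instances instead of classes).
type_mapping = {int: pd.Int64Dtype(), str: pd.StringDtype(), float: np.float64}

# Hypothetical integer field with one missing value.
values = [12, None, 7]

# With the nullable integer dtype the missing entry becomes <NA>
# and the remaining values stay integers.
nullable = pd.Series(values, dtype=type_mapping[int])

# Without an explicit dtype pandas falls back to float64 to hold NaN.
fallback = pd.Series(values)

print(nullable.dtype)  # Int64
print(fallback.dtype)  # float64
```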
Incomplete Data Deletion
Incomplete data in the xarray.DataArray or xarray.Dataset may have to be removed to avoid unexpected behaviour and to save memory. HandleIncompleteData offers two ways of doing so. If no data at all is available for a particular base or reference asset, that coin can be removed from the xarray item. If one of the values corresponding to a particular ticker is nan, the entire contents of that ticker can be set to nan.
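The two kinds of removal can be sketched with plain xarray operations. This is not the library's implementation; the dimension names (`base_assets`, `timestamp`, `ohlcv_fields`), the coin names and the values are assumptions chosen for illustration.

```python
import numpy as np
import xarray as xr

# Hypothetical history: BTC has one incomplete ticker, XYZ has no data at all.
history = xr.DataArray(
    np.array(
        [
            [[1.0, 2.0], [3.0, np.nan]],
            [[np.nan, np.nan], [np.nan, np.nan]],
        ]
    ),
    dims=["base_assets", "timestamp", "ohlcv_fields"],
    coords={
        "base_assets": ["BTC", "XYZ"],
        "timestamp": [0, 1],
        "ohlcv_fields": ["open", "close"],
    },
)

# 1. Drop every coin whose values are all nan.
kept = history.dropna(dim="base_assets", how="all")

# 2. Set a whole ticker to nan if any of its fields is nan.
ticker_is_complete = kept.notnull().all(dim="ohlcv_fields")
nullified = kept.where(ticker_is_complete)

print(list(kept.base_assets.values))  # only BTC remains
```

In this sketch the BTC ticker at timestamp 1 ends up entirely nan, while the ticker at timestamp 0 is left unchanged.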
- class crypto_history.data_container.data_container_post.HandleIncompleteData
  Responsible for handling missing data:
  1. Dropping a coin if all of its data is null
  2. Nullifying a ticker if it has incomplete data
- drop_xarray_coins_with_entire_na(data_item: Union[xarray.core.dataarray.DataArray, xarray.core.dataset.Dataset]) → Union[xarray.core.dataarray.DataArray, xarray.core.dataset.Dataset]
  Drops the coins from the base/reference asset if all their corresponding values are nan
  Parameters: data_item (xr.DataArray/xr.Dataset) – which contains information of the coin histories
  Returns: xr.DataArray/xr.Dataset where the coins have been dropped
- get_all_coord_combinations(data_item: Union[xarray.core.dataarray.DataArray, xarray.core.dataset.Dataset])
  Gets all the various combinations to iterate according to the coordinates to drop
  Parameters: data_item (xr.DataArray/xr.DataSet) – data_item whose combinations need to be iterated over
  Yields: A dict with various combinations
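One common way to yield coordinate combinations as dicts is a Cartesian product over the coordinate values. The sketch below is a guess at the general idea, not the method's actual code: the function name `iter_coord_combinations`, the plain-dict input and the coordinate names are hypothetical.

```python
from itertools import product
from typing import Dict, Iterator, Sequence


def iter_coord_combinations(
    coords: Dict[str, Sequence[str]]
) -> Iterator[Dict[str, str]]:
    """Yield one dict per combination of the given coordinate values."""
    names = list(coords)
    for values in product(*(coords[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical coordinates of a coin-history item.
combinations = list(
    iter_coord_combinations(
        {"base_assets": ["BTC", "ETH"], "reference_assets": ["USDT"]}
    )
)
print(combinations)
```

Each yielded dict has the shape that xarray's `.sel(**combination)` or `.loc[combination]` accepts, which makes it convenient for visiting one coin pair at a time.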
- nullify_incomplete_data_from_dataarray(dataarray: xarray.core.dataarray.DataArray) → xarray.core.dataarray.DataArray
  Nullifies incomplete data from the xr.DataArray
  Parameters: dataarray (xr.DataArray) – dataarray whose coordinates are to be nullified
  Returns: xr.DataArray whose data has been nullified if incomplete
- nullify_incomplete_data_from_dataset(dataset: xarray.core.dataset.Dataset) → xarray.core.dataset.Dataset
  Nullifies the incomplete data of datasets
  Notes: Using indexing to assign values to a subset of dataset (e.g., ds[dict(space=0)] = 1) is not yet supported. http://xarray.pydata.org/en/stable/indexing.html
  Parameters: dataset (xr.Dataset) – dataset whose data is to be nullified
  Returns: xr.Dataset whose incomplete items are nullified
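Because indexed assignment on a Dataset subset is not available, one alternative is to build a boolean mask and apply it with `where`. The sketch below shows that approach under assumed variable names (`open`, `close`) and an assumed `timestamp` dimension. Whether the library uses this technique is not stated in the documentation above.

```python
import numpy as np
import xarray as xr

# Hypothetical dataset with one data variable per OHLCV field.
dataset = xr.Dataset(
    {
        "open": ("timestamp", [1.0, np.nan, 3.0]),
        "close": ("timestamp", [1.5, 2.5, 3.5]),
    },
    coords={"timestamp": [0, 1, 2]},
)

# Stack the variables along a new dimension to test all fields at once.
stacked = dataset.to_array(dim="ohlcv_fields")
ticker_is_complete = stacked.notnull().all(dim="ohlcv_fields")

# Keep values where the ticker is complete; everything else becomes nan.
nullified = dataset.where(ticker_is_complete)

print(nullified["close"].values)
```

Here the `close` value at timestamp 1 is replaced by nan because the `open` value of the same ticker was missing.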