pandora.imputation module

pandora.imputation.impute_data(input_data: ndarray[Any, dtype[_ScalarType_co]], imputation: str | None, missing_value: float | int = nan) -> ndarray[Any, dtype[_ScalarType_co]]

Imputes missing values in the given input data using the given imputation strategy.

Parameters:
input_data : npt.NDArray

NumPy array containing the input data to impute. Missing values are expected to be np.nan unless a different marker is passed via missing_value.

imputation : Optional[str]

Imputation method to use. Available options are:

  • "mean": Imputes missing values with the average of the respective column.

  • "remove": Removes all columns with at least one missing value.

  • None: Does not impute the given data.

missing_value : Union[float, int], default=np.nan

Value in input_data that marks a missing entry.
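
Because NaN never compares equal to itself, a mask for the default missing_value cannot be built with ==. The following is a minimal sketch of how such a mask could be computed with plain NumPy. The helper name missing_mask is hypothetical and not part of pandora; how pandora detects missing entries internally is not stated here.

```python
import numpy as np


def missing_mask(data, missing_value=np.nan):
    """Boolean mask of missing entries (hypothetical helper, not pandora's API)."""
    data = np.asarray(data, dtype=float)
    if np.isnan(missing_value):
        # NaN != NaN, so an equality test would never match; use np.isnan.
        return np.isnan(data)
    # Any other marker (for example a sentinel such as -9) can be compared directly.
    return data == missing_value


nan_marked = np.array([[1.0, np.nan], [3.0, 4.0]])
sentinel_marked = np.array([[1, -9], [3, 4]])

print(missing_mask(nan_marked))
print(missing_mask(sentinel_marked, missing_value=-9))
```

Both calls flag only the second entry of the first row as missing.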

Returns:
imputed_data : npt.NDArray

Input data imputed according to the specified method.

Raises:
PandoraException
  • If no data is left after applying the "remove" imputation strategy, that is, if every column of the input data contains at least one missing value.

  • If the imputation method is not supported.
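
To make the documented strategies and error cases concrete, here is a rough NumPy-only sketch of the described behaviour. It is an illustration under assumptions, not pandora's implementation: the function name impute_sketch is hypothetical, missing values are assumed to be marked as NaN, and ValueError stands in for PandoraException.

```python
import numpy as np


def impute_sketch(data, imputation):
    """Illustrative stand-in for the documented strategies; not pandora's code."""
    data = np.asarray(data, dtype=float)
    if imputation is None:
        # None: return the data without imputing anything.
        return data
    if imputation == "mean":
        # Replace each NaN with the mean of the non-missing values in its column.
        # (A column that is entirely NaN would stay NaN in this sketch.)
        col_means = np.nanmean(data, axis=0)
        return np.where(np.isnan(data), col_means, data)
    if imputation == "remove":
        # Keep only columns without any missing value.
        keep = ~np.isnan(data).any(axis=0)
        if not keep.any():
            # Stand-in for PandoraException: every column had a missing value.
            raise ValueError("No data left after removing columns with missing values.")
        return data[:, keep]
    # Stand-in for PandoraException: unsupported method.
    raise ValueError(f"Unsupported imputation method: {imputation}")


example = np.array([[1.0, np.nan, 3.0],
                    [3.0, 4.0, np.nan]])

print(impute_sketch(example, "mean"))    # NaNs replaced by column means
print(impute_sketch(example, "remove"))  # only the first column survives
```

With this input, "mean" fills the gaps with 4.0 and 3.0, while "remove" keeps only the first column because the other two each contain a missing value.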