Understanding the wquantiles module in Python

  • wquantiles (GitHub)

    Weighted quantiles with Python, including the weighted median.

    The main methods are quantile and median. The input of quantile is a numpy array (data), a one-dimensional numpy array of weights, and the value of the quantile to compute (between 0 and 1). The weighting is applied along the last axis.

    The median method is an alias for quantile(data, weights, 0.5).

  • Code

    """
    Library to compute weighted quantiles, including the weighted median, of
    numpy arrays.
    """
    from __future__ import print_function
    import numpy as np
    
    __version__ = "0.4"
    
    
    def quantile_1D(data, weights, quantile):
        """
        Compute the weighted quantile of a 1D numpy array.
    
        Parameters
        ----------
        data : ndarray
            Input array (one dimension).
        weights : ndarray
            Array with the weights of the same size of `data`.
        quantile : float
            Quantile to compute. It must have a value between 0 and 1.
    
        Returns
        -------
        quantile_1D : float
            The output value.
        """
        # Check the data
        if not isinstance(data, np.matrix):
            data = np.asarray(data)
        if not isinstance(weights, np.matrix):
            weights = np.asarray(weights)
        nd = data.ndim
        if nd != 1:
            raise TypeError("data must be a one dimensional array")
        ndw = weights.ndim
        if ndw != 1:
            raise TypeError("weights must be a one dimensional array")
        if data.shape != weights.shape:
            raise TypeError("the length of data and weights must be the same")
        if ((quantile > 1.) or (quantile < 0.)):
            raise ValueError("quantile must have a value between 0. and 1.")
        # Sort the data
        ind_sorted = np.argsort(data)
        sorted_data = data[ind_sorted]
        sorted_weights = weights[ind_sorted]
        # Compute the auxiliary arrays
        Sn = np.cumsum(sorted_weights)
        # TODO: Check that the weights do not sum zero
        #assert Sn != 0, "The sum of the weights must not be zero"
        Pn = (Sn-0.5*sorted_weights)/np.sum(sorted_weights)
        # Get the value of the weighted quantile by linear interpolation
        return np.interp(quantile, Pn, sorted_data)
    
    
    def quantile(data, weights, quantile):
        """
        Weighted quantile of an array with respect to the last axis.
    
        Parameters
        ----------
        data : ndarray
            Input array.
        weights : ndarray
            Array with the weights. It must have the same size of the last 
            axis of `data`.
        quantile : float
            Quantile to compute. It must have a value between 0 and 1.
    
        Returns
        -------
        quantile : float
            The output value.
        """
        # TODO: Allow to specify the axis
        nd = data.ndim
        if nd == 0:
            raise TypeError("data must have at least one dimension")
        elif nd == 1:
            return quantile_1D(data, weights, quantile)
        elif nd > 1:
            n = data.shape
            imr = data.reshape((np.prod(n[:-1]), n[-1]))
            result = np.apply_along_axis(quantile_1D, -1, imr, weights, quantile)
            return result.reshape(n[:-1])
    
    
    def median(data, weights):
        """
        Weighted median of an array with respect to the last axis.
    
        Alias for `quantile(data, weights, 0.5)`.
        """
        return quantile(data, weights, 0.5)
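
To make the steps inside quantile_1D easier to follow, here is a minimal sketch that repeats them by hand on a tiny array. The sample values and weights are made up for illustration; only numpy calls that already appear in the listing above are used.

```python
import numpy as np

# Hypothetical sample values and weights, chosen only for illustration.
data = np.array([30.0, 10.0, 20.0])
weights = np.array([1.0, 1.0, 2.0])

# Sort the data and reorder the weights in the same way.
order = np.argsort(data)
sorted_data = data[order]          # [10., 20., 30.]
sorted_weights = weights[order]    # [1., 2., 1.]

# Cumulative weights, then the midpoint position of each sample in [0, 1].
Sn = np.cumsum(sorted_weights)                            # [1., 3., 4.]
Pn = (Sn - 0.5 * sorted_weights) / sorted_weights.sum()   # [0.125, 0.5, 0.875]

# Linear interpolation between those positions gives the requested quantile.
weighted_median = np.interp(0.5, Pn, sorted_data)
first_quartile = np.interp(0.25, Pn, sorted_data)

print(weighted_median)   # 20.0
print(first_quartile)    # about 13.33
```

Because the value 20 carries half of the total weight, its midpoint position is exactly 0.5, so the weighted median lands on it. Note also that np.interp clamps to the first or last sample when the requested quantile falls outside the range of Pn.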
    
    
  • Functions

  • np.argsort(data)
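
np.argsort returns the indices that would sort an array, rather than the sorted values themselves. That is why the module can reorder data and weights with the same index array. A small sketch, with made-up values:

```python
import numpy as np

# Hypothetical values, chosen only for illustration.
data = np.array([30, 10, 20])
weights = np.array([0.5, 0.2, 0.3])

# Indices that put `data` in ascending order.
ind_sorted = np.argsort(data)

print(ind_sorted)            # [1 2 0]
print(data[ind_sorted])      # [10 20 30]
print(weights[ind_sorted])   # [0.2 0.3 0.5]
```

Each weight stays attached to its original data value after sorting.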

  • np.cumsum()
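
np.cumsum returns the running total of an array. In the module it is applied to the sorted weights, so each entry is the total weight up to and including that sample. A small sketch, with made-up weights:

```python
import numpy as np

# Hypothetical weights, already in the order of the sorted data.
sorted_weights = np.array([1.0, 2.0, 1.0])

# Running total of the weights.
Sn = np.cumsum(sorted_weights)

print(Sn)                                # [1. 3. 4.]
print(Sn[-1] == sorted_weights.sum())    # True
```

The last element of the cumulative sum equals the total weight, which is the normalising factor used in the listing.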


Source: blog.csdn.net/The_Time_Runner/article/details/109230533