Calculate average of every n rows from a csv file

Saeed :

I have a csv file that has 25000 rows. I want to put the average of every 30 rows in another csv file.

Here is an example with 9 rows averaged in groups of 3, so the new csv file has 3 rows (3, 1, 2):

|   H    |
 ========
|   1    |---\
|   3    |   |------------->| 3 |
|   5    |---/
|  -1    |---\
|   3    |   |------------->| 1 |
|   1    |---/
|   0    |---\
|   5    |   |------------->| 2 |
|   1    |---/

What I did:

import numpy as np
import pandas as pd

m_path = "ALL0001.CSV"

m_df = pd.read_csv(m_path, usecols=['Col-01']) 
m_arr =  np.array([])
temp = m_df.to_numpy()
step = 30
for i in range(0, 25000, step):  # start at 0 so the first row is included
    m_arr = np.append(m_arr, np.array([np.average(temp[i:i + step])]))

data = np.array(m_arr)[np.newaxis]

m_df = pd.DataFrame({'Column1': data[0, :]})
m_df.to_csv('AVG.csv')

This works well, but is there a better solution?

jezrael :

You can use integer division of the index by step to label consecutive groups of rows, then pass those labels to groupby and aggregate with mean:

step = 30
m_df = pd.read_csv(m_path, usecols=['Col-01']) 
df = m_df.groupby(m_df.index // step).mean()
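
One thing to keep in mind with the numbers from the question: 25000 is not a multiple of 30, so the last group is shorter than the others and its mean is taken over the leftover rows only. A minimal sketch of that, assuming a made-up frame whose Col-01 column just holds 0 to 24999 in place of the real file:

```python
import numpy as np
import pandas as pd

step = 30
n_rows = 25000

# Hypothetical stand-in for ALL0001.CSV: Col-01 simply counts 0..24999.
m_df = pd.DataFrame({"Col-01": np.arange(n_rows)})

groups = m_df.groupby(m_df.index // step)
sizes = groups.size()   # number of rows that went into each group
df = groups.mean()      # one mean per group

n_groups = len(df)                 # 833 full groups plus 1 partial group
last_size = int(sizes.iloc[-1])    # rows left over for the final group
print(n_groups, last_size)
```

If the partial group should not be averaged, it can be dropped afterwards, for example by keeping only the groups whose size equals step.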

Or:

df = m_df.groupby(np.arange(len(m_df)) // step).mean()
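
The two variants only differ when the index is not the default 0, 1, 2, ... range: the first groups by index label, the np.arange one groups by row position. A rough sketch of the difference, assuming the nine sample values from the question in a column H and a hypothetical filtering step that drops the first row:

```python
import numpy as np
import pandas as pd

step = 3

# The nine sample values from the question, in a column named H.
m_df = pd.DataFrame({"H": [1, 3, 5, -1, 3, 1, 0, 5, 1]})

# With the default index, both groupings produce the same means.
by_label = m_df.groupby(m_df.index // step).mean()
by_position = m_df.groupby(np.arange(len(m_df)) // step).mean()

# Hypothetical filtering step: drop the first row, leaving labels 1..8.
filtered = m_df.drop(index=0)

# Label-based grouping now starts with a short group ...
label_sizes = filtered.groupby(filtered.index // step).size().tolist()
# ... while position-based grouping still takes rows in blocks of `step`.
position_sizes = filtered.groupby(np.arange(len(filtered)) // step).size().tolist()

print(label_sizes, position_sizes)
```

So for a frame read straight from read_csv either form is fine, and the np.arange form is the safer one after rows have been removed or the index has been changed.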

With the sample data from the question:

step = 3
df = m_df.groupby(m_df.index // step).mean()
print (df)
   H
0  3
1  1
2  2
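
To finish the task from the question, the grouped means still have to be written to a new csv file. A minimal end-to-end sketch, assuming in-memory buffers in place of the real ALL0001.CSV and AVG.csv files and the sample column H instead of Col-01:

```python
import io

import pandas as pd

step = 3

# Stand-in for the input file: the nine sample rows under a header H.
csv_in = io.StringIO("H\n1\n3\n5\n-1\n3\n1\n0\n5\n1\n")
m_df = pd.read_csv(csv_in, usecols=["H"])

# Mean of every `step` consecutive rows.
df = m_df.groupby(m_df.index // step).mean()

# Stand-in for AVG.csv; index=False leaves out the group numbers 0, 1, 2.
csv_out = io.StringIO()
df.to_csv(csv_out, index=False)
result = csv_out.getvalue()
print(result)
```

With real files, pass the paths instead of the buffers, e.g. df.to_csv('AVG.csv', index=False). Without index=False the group number is written as an extra first column, which is also what the to_csv call in the question does.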
