Importing, parsing, processing and exporting datasets (with the Pandas module)

FIZ353 - Numerical Analysis | 09/10/2020

Importing a CSV file with Pandas
    Getting rid of the unnecessary header stuff
    Designating the column labels
    Designating the index using one of the columns
Accessing to the data
    Old-school style: referring via the col/row indexes
    Referring via the col/row labels
Filtering
Processing
Creating a DataFrame
Exporting the DataFrame to a CSV file


Emre S. Tasci emre.tasci@hacettepe.edu.tr

Numerical analysis deals with various kinds of data: be it experimental results, surveys, collected values, etc. But one thing is almost always sure: there's always a huge amount of data lying around and we will be trying to make the most of it.

The first step to data processing is, of course to introduce the data to our code via importing (load).

Most of the time, the data is collected in matrix form and stored in comma-separated value (CSV) form, e.g.,

location,Ankara,Ankara,Ankara lat,40.01502,40.01502,40.01502 lon,32.73532,32.73532,32.73532 asl,923.411,923.411,923.411 variable,Temperature,Relative Humidity,Precipitation Total unit,°C,%,mm level,2 m elevation corrected,2 m,sfc resolution,hourly,hourly,hourly aggregation,None,None,None timestamp,Ankara Temperature [2 m elevation corrected],Ankara Relative Humidity [2 m],Ankara Precipitation Total 20201001T0000,12.437169,59.0,0.0 20201001T0100,12.557169,63.0,0.0 20201001T0200,13.177169,68.0,0.0 20201001T0300,13.087169,76.0,0.0 20201001T0400,12.867169,81.0,0.0 20201001T0500,11.567169,88.0,0.0 20201001T0600,11.177169,90.0,0.0 20201001T0700,12.187169,87.0,0.0 20201001T0800,13.797169,78.0,0.0 20201001T0900,14.967169,72.0,0.0 20201001T1000,16.787169,62.0,0.0 20201001T1100,18.367168,55.0,0.0 20201001T1200,20.957169,41.0,0.0 20201001T1300,22.057169,36.0,0.0 (...) 20201008T2100,24.777168,22.0,0.0 20201008T2200,23.627169,24.0,0.0 20201008T2300,22.727169,27.0,0.0

The data above is taken from meteoblue site and contains the temperature, humidity and precipation data for Ankara for this month so far (8 days). But as you can observe, the actual data begins on the 11th line, with the previous lines containing identification information like location, coordinates, units, etc. This kind of preinformation lines are reffered to as headers.

Importing a CSV file with Pandas

Importing the data in a CSV file is pretty straight-forward using Pandas' read_csv command:

In [1]:
import pandas as pd
data_with_headerclutter = pd.read_csv("data/01_dataexport_20201008T180753.csv")
print(data_with_headerclutter)
          location       Ankara           Ankara.1             Ankara.2
0              lat     40.01502           40.01502             40.01502
1              lon     32.73532           32.73532             32.73532
2              asl      923.411            923.411              923.411
3         variable  Temperature  Relative Humidity  Precipitation Total
4             unit           °C                  %                   mm
..             ...          ...                ...                  ...
196  20201008T1900    26.437168               20.0                  0.0
197  20201008T2000     25.56717               21.0                  0.0
198  20201008T2100    24.777168               22.0                  0.0
199  20201008T2200    23.627169               24.0                  0.0
200  20201008T2300    22.727169               27.0                  0.0

[201 rows x 4 columns]

Here we see a couple of things, the foremost important one being that, Pandas don't worry about different types of variables as NumPy would. It just takes whatever it finds and -for the moment- treating the first line as the header row, takes all in.

Second, when we ask it to display the imported data, it prints a summary of things, 5 lines from the top and 5 lines from the bottom, also reporting the true size [201 x 4]. We can change the number of lines shown by setting the 'display.min_rows' & 'display.max_rows' options:

In [2]:
## Display a total of 6 rows only
#pd.set_option('display.min_rows', 6)
pd.set_option('display.max_rows', 6)
print(data_with_headerclutter)
## We can as well print it out by direct referral:
#data_with_headers
          location     Ankara  Ankara.1  Ankara.2
0              lat   40.01502  40.01502  40.01502
1              lon   32.73532  32.73532  32.73532
2              asl    923.411   923.411   923.411
..             ...        ...       ...       ...
198  20201008T2100  24.777168      22.0       0.0
199  20201008T2200  23.627169      24.0       0.0
200  20201008T2300  22.727169      27.0       0.0

[201 rows x 4 columns]
In [3]:
data_with_headerclutter.shape
Out[3]:
(201, 4)
In [4]:
## Display all rows 
pd.set_option('display.max_rows', None)
print(data_with_headerclutter)
          location                                        Ankara  \
0              lat                                      40.01502   
1              lon                                      32.73532   
2              asl                                       923.411   
3         variable                                   Temperature   
4             unit                                            °C   
5            level                       2 m elevation corrected   
6       resolution                                        hourly   
7      aggregation                                          None   
8        timestamp  Ankara Temperature [2 m elevation corrected]   
9    20201001T0000                                     12.437169   
10   20201001T0100                                     12.557169   
11   20201001T0200                                     13.177169   
12   20201001T0300                                     13.087169   
13   20201001T0400                                     12.867169   
14   20201001T0500                                     11.567169   
15   20201001T0600                                     11.177169   
16   20201001T0700                                     12.187169   
17   20201001T0800                                     13.797169   
18   20201001T0900                                     14.967169   
19   20201001T1000                                     16.787169   
20   20201001T1100                                     18.367168   
21   20201001T1200                                     20.957169   
22   20201001T1300                                     22.057169   
23   20201001T1400                                     22.857168   
24   20201001T1500                                     23.527168   
25   20201001T1600                                      23.40717   
26   20201001T1700                                     22.687168   
27   20201001T1800                                      21.41717   
28   20201001T1900                                     18.537169   
29   20201001T2000                                     16.797169   
30   20201001T2100                                     14.717169   
31   20201001T2200                                     12.627169   
32   20201001T2300                                     11.187169   
33   20201002T0000                                     9.9971695   
34   20201002T0100                                      9.057169   
35   20201002T0200                                      8.357169   
36   20201002T0300                                      7.727169   
37   20201002T0400                                      7.307169   
38   20201002T0500                                      7.107169   
39   20201002T0600                                      7.177169   
40   20201002T0700                                     7.1271687   
41   20201002T0800                                      9.537169   
42   20201002T0900                                     13.377169   
43   20201002T1000                                      16.73717   
44   20201002T1100                                     19.697168   
45   20201002T1200                                     21.047169   
46   20201002T1300                                     22.027168   
47   20201002T1400                                     22.787169   
48   20201002T1500                                     23.187168   
49   20201002T1600                                     23.037169   
50   20201002T1700                                     22.357168   
51   20201002T1800                                      20.83717   
52   20201002T1900                                      18.64717   
53   20201002T2000                                      17.67717   
54   20201002T2100                                      16.98717   
55   20201002T2200                                     16.087168   
56   20201002T2300                                     15.297169   
57   20201003T0000                                     13.867169   
58   20201003T0100                                     12.397169   
59   20201003T0200                                     11.107169   
60   20201003T0300                                     10.157169   
61   20201003T0400                                     10.587169   
62   20201003T0500                                     10.507169   
63   20201003T0600                                     10.147169   
64   20201003T0700                                      9.617169   
65   20201003T0800                                     11.397169   
66   20201003T0900                                      16.32717   
67   20201003T1000                                      20.09717   
68   20201003T1100                                     21.947168   
69   20201003T1200                                     23.217169   
70   20201003T1300                                     24.187168   
71   20201003T1400                                     24.887169   
72   20201003T1500                                     25.297169   
73   20201003T1600                                      25.16717   
74   20201003T1700                                     24.477169   
75   20201003T1800                                      22.58717   
76   20201003T1900                                      20.83717   
77   20201003T2000                                     19.797169   
78   20201003T2100                                     18.447168   
79   20201003T2200                                     16.707169   
80   20201003T2300                                     14.087169   
81   20201004T0000                                     11.627169   
82   20201004T0100                                      9.607169   
83   20201004T0200                                      8.857169   
84   20201004T0300                                      8.597169   
85   20201004T0400                                     10.287169   
86   20201004T0500                                     11.367169   
87   20201004T0600                                     11.787169   
88   20201004T0700                                     11.687169   
89   20201004T0800                                     12.857169   
90   20201004T0900                                     17.277168   
91   20201004T1000                                      20.51717   
92   20201004T1100                                     22.697168   
93   20201004T1200                                     24.217169   
94   20201004T1300                                     25.527168   
95   20201004T1400                                      26.48717   
96   20201004T1500                                      27.06717   
97   20201004T1600                                      27.06717   
98   20201004T1700                                      26.48717   
99   20201004T1800                                     24.357168   
100  20201004T1900                                     22.057169   
101  20201004T2000                                      20.31717   
102  20201004T2100                                      18.31717   
103  20201004T2200                                     15.907169   
104  20201004T2300                                     13.397169   
105  20201005T0000                                     11.437169   
106  20201005T0100                                     10.687169   
107  20201005T0200                                     10.317169   
108  20201005T0300                                     10.107169   
109  20201005T0400                                     11.477169   
110  20201005T0500                                     12.227169   
111  20201005T0600                                     12.567169   
112  20201005T0700                                     12.347169   
113  20201005T0800                                     13.457169   
114  20201005T0900                                     17.947168   
115  20201005T1000                                      21.64717   
116  20201005T1100                                     24.447168   
117  20201005T1200                                      27.00717   
118  20201005T1300                                      29.07717   
119  20201005T1400                                      30.67717   
120  20201005T1500                                     31.867168   
121  20201005T1600                                     31.697168   
122  20201005T1700                                     30.627169   
123  20201005T1800                                      28.06717   
124  20201005T1900                                     25.797169   
125  20201005T2000                                     22.367168   
126  20201005T2100                                      18.81717   
127  20201005T2200                                      16.58717   
128  20201005T2300                                     15.387169   
129  20201006T0000                                     14.857169   
130  20201006T0100                                     14.467169   
131  20201006T0200                                     14.337169   
132  20201006T0300                                     14.147169   
133  20201006T0400                                     15.477169   
134  20201006T0500                                     16.107168   
135  20201006T0600                                     16.297169   
136  20201006T0700                                     16.097168   
137  20201006T0800                                     17.227169   
138  20201006T0900                                      21.34717   
139  20201006T1000                                     25.057169   
140  20201006T1100                                     28.187168   
141  20201006T1200                                      30.57717   
142  20201006T1300                                      31.84717   
143  20201006T1400                                      32.27717   
144  20201006T1500                                      32.33717   
145  20201006T1600                                     31.947168   
146  20201006T1700                                     30.807169   
147  20201006T1800                                      28.65717   
148  20201006T1900                                     26.527168   
149  20201006T2000                                     22.297169   
150  20201006T2100                                      19.40717   
151  20201006T2200                                      17.01717   
152  20201006T2300                                     15.407169   
153  20201007T0000                                     14.817169   
154  20201007T0100                                    14.7471695   
155  20201007T0200                                     14.847169   
156  20201007T0300                                     15.127169   
157  20201007T0400                                      16.49717   
158  20201007T0500                                      17.00717   
159  20201007T0600                                      17.16717   
160  20201007T0700                                      17.07717   
161  20201007T0800                                      18.32717   
162  20201007T0900                                      22.24717   
163  20201007T1000                                      25.92717   
164  20201007T1100                                     28.807169   
165  20201007T1200                                     30.807169   
166  20201007T1300                                     31.717169   
167  20201007T1400                                     32.057167   
168  20201007T1500                                     32.107166   
169  20201007T1600                                      31.39717   
170  20201007T1700                                      30.33717   
171  20201007T1800                                      28.56717   
172  20201007T1900                                      27.07717   
173  20201007T2000                                      26.34717   
174  20201007T2100                                     25.867168   
175  20201007T2200                                     24.947168   
176  20201007T2300                                     24.117168   
177  20201008T0000                                     24.047169   
178  20201008T0100                                      22.50717   
179  20201008T0200                                     21.047169   
180  20201008T0300                                      19.51717   
181  20201008T0400                                     18.447168   
182  20201008T0500                                      17.41717   
183  20201008T0600                                      16.59717   
184  20201008T0700                                     15.627169   
185  20201008T0800                                      17.15717   
186  20201008T0900                                      22.51717   
187  20201008T1000                                     25.977169   
188  20201008T1100                                      28.51717   
189  20201008T1200                                      29.56717   
190  20201008T1300                                     29.137169   
191  20201008T1400                                      30.00717   
192  20201008T1500                                     30.287169   
193  20201008T1600                                     30.357168   
194  20201008T1700                                     29.857168   
195  20201008T1800                                      28.33717   
196  20201008T1900                                     26.437168   
197  20201008T2000                                      25.56717   
198  20201008T2100                                     24.777168   
199  20201008T2200                                     23.627169   
200  20201008T2300                                     22.727169   

                           Ankara.1                    Ankara.2  
0                          40.01502                    40.01502  
1                          32.73532                    32.73532  
2                           923.411                     923.411  
3                 Relative Humidity         Precipitation Total  
4                                 %                          mm  
5                               2 m                         sfc  
6                            hourly                      hourly  
7                              None                        None  
8    Ankara Relative Humidity [2 m]  Ankara Precipitation Total  
9                              59.0                         0.0  
10                             63.0                         0.0  
11                             68.0                         0.0  
12                             76.0                         0.0  
13                             81.0                         0.0  
14                             88.0                         0.0  
15                             90.0                         0.0  
16                             87.0                         0.0  
17                             78.0                         0.0  
18                             72.0                         0.0  
19                             62.0                         0.0  
20                             55.0                         0.0  
21                             41.0                         0.0  
22                             36.0                         0.0  
23                             33.0                         0.0  
24                             29.0                         0.0  
25                             28.0                         0.0  
26                             28.0                         0.0  
27                             30.0                         0.0  
28                             35.0                         0.0  
29                             39.0                         0.0  
30                             44.0                         0.0  
31                             54.0                         0.0  
32                             64.0                         0.0  
33                             72.0                         0.0  
34                             80.0                         0.0  
35                             86.0                         0.0  
36                             89.0                         0.0  
37                             90.0                         0.0  
38                             92.0                         0.0  
39                             91.0                         0.0  
40                             90.0                         0.0  
41                             76.0                         0.0  
42                             64.0                         0.0  
43                             52.0                         0.0  
44                             38.0                         0.0  
45                             29.0                         0.0  
46                             26.0                         0.0  
47                             23.0                         0.0  
48                             22.0                         0.0  
49                             20.0                         0.0  
50                             21.0                         0.0  
51                             23.0                         0.0  
52                             26.0                         0.0  
53                             28.0                         0.0  
54                             29.0                         0.0  
55                             31.0                         0.0  
56                             33.0                         0.0  
57                             36.0                         0.0  
58                             41.0                         0.0  
59                             46.0                         0.0  
60                             51.0                         0.0  
61                             51.0                         0.0  
62                             50.0                         0.0  
63                             49.0                         0.0  
64                             47.0                         0.0  
65                             42.0                         0.0  
66                             31.0                         0.0  
67                             24.0                         0.0  
68                             21.0                         0.0  
69                             19.0                         0.0  
70                             18.0                         0.0  
71                             16.0                         0.0  
72                             16.0                         0.0  
73                             15.0                         0.0  
74                             16.0                         0.0  
75                             19.0                         0.0  
76                             21.0                         0.0  
77                             23.0                         0.0  
78                             25.0                         0.0  
79                             29.0                         0.0  
80                             35.0                         0.0  
81                             44.0                         0.0  
82                             57.0                         0.0  
83                             65.0                         0.0  
84                             70.0                         0.0  
85                             62.0                         0.0  
86                             55.0                         0.0  
87                             51.0                         0.0  
88                             50.0                         0.0  
89                             46.0                         0.0  
90                             35.0                         0.0  
91                             28.0                         0.0  
92                             25.0                         0.0  
93                             21.0                         0.0  
94                             18.0                         0.0  
95                             15.0                         0.0  
96                             14.0                         0.0  
97                             14.0                         0.0  
98                             15.0                         0.0  
99                             17.0                         0.0  
100                            20.0                         0.0  
101                            23.0                         0.0  
102                            27.0                         0.0  
103                            32.0                         0.0  
104                            38.0                         0.0  
105                            46.0                         0.0  
106                            50.0                         0.0  
107                            54.0                         0.0  
108                            55.0                         0.0  
109                            51.0                         0.0  
110                            48.0                         0.0  
111                            46.0                         0.0  
112                            46.0                         0.0  
113                            42.0                         0.0  
114                            32.0                         0.0  
115                            25.0                         0.0  
116                            21.0                         0.0  
117                            18.0                         0.0  
118                            15.0                         0.0  
119                            14.0                         0.0  
120                            13.0                         0.0  
121                            14.0                         0.0  
122                            15.0                         0.0  
123                            19.0                         0.0  
124                            23.0                         0.0  
125                            29.0                         0.0  
126                            36.0                         0.0  
127                            42.0                         0.0  
128                            45.0                         0.0  
129                            47.0                         0.0  
130                            49.0                         0.0  
131                            50.0                         0.0  
132                            51.0                         0.0  
133                            47.0                         0.0  
134                            46.0                         0.0  
135                            45.0                         0.0  
136                            46.0                         0.0  
137                            43.0                         0.0  
138                            33.0                         0.0  
139                            27.0                         0.0  
140                            22.0                         0.0  
141                            18.0                         0.0  
142                            16.0                         0.0  
143                            15.0                         0.0  
144                            15.0                         0.0  
145                            15.0                         0.0  
146                            16.0                         0.0  
147                            19.0                         0.0  
148                            22.0                         0.0  
149                            29.0                         0.0  
150                            35.0                         0.0  
151                            42.0                         0.0  
152                            48.0                         0.0  
153                            51.0                         0.0  
154                            51.0                         0.0  
155                            51.0                         0.0  
156                            49.0                         0.0  
157                            44.0                         0.0  
158                            42.0                         0.0  
159                            41.0                         0.0  
160                            42.0                         0.0  
161                            39.0                         0.0  
162                            31.0                         0.0  
163                            25.0                         0.0  
164                            20.0                         0.0  
165                            18.0                         0.0  
166                            17.0                         0.0  
167                            16.0                         0.0  
168                            15.0                         0.0  
169                            15.0                         0.0  
170                            17.0                         0.0  
171                            20.0                         0.0  
172                            22.0                         0.0  
173                            24.0                         0.0  
174                            24.0                         0.0  
175                            26.0                         0.0  
176                            27.0                         0.0  
177                            28.0                         0.0  
178                            30.0                         0.0  
179                            33.0                         0.0  
180                            37.0                         0.0  
181                            40.0                         0.0  
182                            44.0                         0.0  
183                            50.0                         0.0  
184                            57.0                         0.0  
185                            52.0                         0.0  
186                            36.0                         0.0  
187                            30.0                         0.0  
188                            26.0                         0.0  
189                            24.0                         0.0  
190                            19.0                         0.0  
191                            17.0                         0.0  
192                            16.0                         0.0  
193                            16.0                         0.0  
194                            16.0                         0.0  
195                            18.0                         0.0  
196                            20.0                         0.0  
197                            21.0                         0.0  
198                            22.0                         0.0  
199                            24.0                         0.0  
200                            27.0                         0.0  

Getting rid of the unnecessary header stuff

The headers are getting in the way (especially the 10th line ("timestamp | Ankara Temp...") even messes the output due to its enormous width. So we simply tell Pandas to start parsing from the 11th line (10th on the zero-indexed, thus we will be skipping 9 lines):

In [5]:
pd.set_option('display.min_rows', 10)
pd.set_option('display.max_rows', 10)
data_without_headerclutter = pd.read_csv("data/01_dataexport_20201008T180753.csv",
                                         skiprows=9)
data_without_headerclutter
Out[5]:
timestamp Ankara Temperature [2 m elevation corrected] Ankara Relative Humidity [2 m] Ankara Precipitation Total
0 20201001T0000 12.437169 59.0 0.0
1 20201001T0100 12.557169 63.0 0.0
2 20201001T0200 13.177169 68.0 0.0
3 20201001T0300 13.087169 76.0 0.0
4 20201001T0400 12.867169 81.0 0.0
... ... ... ... ...
187 20201008T1900 26.437168 20.0 0.0
188 20201008T2000 25.567170 21.0 0.0
189 20201008T2100 24.777168 22.0 0.0
190 20201008T2200 23.627169 24.0 0.0
191 20201008T2300 22.727169 27.0 0.0

192 rows × 4 columns

Designating the column labels

... almost there but it has assumed the previous line before that as the column label line (which is usually true but this time there is just too much clutter), so we'll manually enter the column labels:

In [6]:
data_without_headerclutter.columns = ['Timestamp','Temperature','Relative Humidity','Precipitation Total']
print(data_without_headerclutter)
         Timestamp  Temperature  Relative Humidity  Precipitation Total
0    20201001T0000    12.437169               59.0                  0.0
1    20201001T0100    12.557169               63.0                  0.0
2    20201001T0200    13.177169               68.0                  0.0
3    20201001T0300    13.087169               76.0                  0.0
4    20201001T0400    12.867169               81.0                  0.0
..             ...          ...                ...                  ...
187  20201008T1900    26.437168               20.0                  0.0
188  20201008T2000    25.567170               21.0                  0.0
189  20201008T2100    24.777168               22.0                  0.0
190  20201008T2200    23.627169               24.0                  0.0
191  20201008T2300    22.727169               27.0                  0.0

[192 rows x 4 columns]
In [7]:
data_without_headerclutter
Out[7]:
Timestamp Temperature Relative Humidity Precipitation Total
0 20201001T0000 12.437169 59.0 0.0
1 20201001T0100 12.557169 63.0 0.0
2 20201001T0200 13.177169 68.0 0.0
3 20201001T0300 13.087169 76.0 0.0
4 20201001T0400 12.867169 81.0 0.0
... ... ... ... ...
187 20201008T1900 26.437168 20.0 0.0
188 20201008T2000 25.567170 21.0 0.0
189 20201008T2100 24.777168 22.0 0.0
190 20201008T2200 23.627169 24.0 0.0
191 20201008T2300 22.727169 27.0 0.0

192 rows × 4 columns

Designating the index using one of the columns

Currently, the index (row ids) is enumerated. But if we wish, we can also designate a column that holds unique values (such as the Timestamp column in our example) via the set_index command:

In [8]:
data_without_headerclutter = data_without_headerclutter.set_index('Timestamp')
data_without_headerclutter
Out[8]:
Temperature Relative Humidity Precipitation Total
Timestamp
20201001T0000 12.437169 59.0 0.0
20201001T0100 12.557169 63.0 0.0
20201001T0200 13.177169 68.0 0.0
20201001T0300 13.087169 76.0 0.0
20201001T0400 12.867169 81.0 0.0
... ... ... ...
20201008T1900 26.437168 20.0 0.0
20201008T2000 25.567170 21.0 0.0
20201008T2100 24.777168 22.0 0.0
20201008T2200 23.627169 24.0 0.0
20201008T2300 22.727169 27.0 0.0

192 rows × 3 columns

and thus we now have 3 columns with the former Timestamp column being the index (row identifier).

We can get the list of the column names and row indexes:

In [9]:
# Columns
data_without_headerclutter.columns
Out[9]:
Index(['Temperature', 'Relative Humidity', 'Precipitation Total'], dtype='object')
In [10]:
# Row indexes
data_without_headerclutter.index
Out[10]:
Index(['20201001T0000', '20201001T0100', '20201001T0200', '20201001T0300',
       '20201001T0400', '20201001T0500', '20201001T0600', '20201001T0700',
       '20201001T0800', '20201001T0900',
       ...
       '20201008T1400', '20201008T1500', '20201008T1600', '20201008T1700',
       '20201008T1800', '20201008T1900', '20201008T2000', '20201008T2100',
       '20201008T2200', '20201008T2300'],
      dtype='object', name='Timestamp', length=192)

Accessing to the data

Now that we have managed to import the datafile, we have all the data under our reach - hooray! 8)

From here on, it runs more or less in the same vein as a NumPy or GNU Octave/MATLAB arrays, with the main difference being the ability to also refer directly using the column and row labels.

Old-school style: referring via the col/row indexes

For this kind of referrence, we use the iloc command. But before we do that, let me redefine the dataset name to df (for 'DataFrame').

In [11]:
df = data_without_headerclutter 
# (I'm just tired of typing 'data_without_headerclutter' all the time!)
In [12]:
df
Out[12]:
Temperature Relative Humidity Precipitation Total
Timestamp
20201001T0000 12.437169 59.0 0.0
20201001T0100 12.557169 63.0 0.0
20201001T0200 13.177169 68.0 0.0
20201001T0300 13.087169 76.0 0.0
20201001T0400 12.867169 81.0 0.0
... ... ... ...
20201008T1900 26.437168 20.0 0.0
20201008T2000 25.567170 21.0 0.0
20201008T2100 24.777168 22.0 0.0
20201008T2200 23.627169 24.0 0.0
20201008T2300 22.727169 27.0 0.0

192 rows × 3 columns

In [13]:
df.iloc[2,0] # Row 2, Col 0
Out[13]:
13.177169000000001
In [14]:
df.iloc[[4,1],[1,0]] # Rows 4 and 1 && Cols 1 and 0
Out[14]:
Relative Humidity Temperature
Timestamp
20201001T0400 81.0 12.867169
20201001T0100 63.0 12.557169
In [15]:
df.iloc[1:4,0:2] # Rows [1,4) && Cols [0,2)
Out[15]:
Temperature Relative Humidity
Timestamp
20201001T0100 12.557169 63.0
20201001T0200 13.177169 68.0
20201001T0300 13.087169 76.0
In [16]:
df.iloc[[1,3,6],:] # Rows 1,3 and 6 && All cols
Out[16]:
Temperature Relative Humidity Precipitation Total
Timestamp
20201001T0100 12.557169 63.0 0.0
20201001T0300 13.087169 76.0 0.0
20201001T0600 11.177169 90.0 0.0

Referring via the col/row labels

And then, we have the option to call by the labels, using loc:

In [17]:
df.loc[['20201001T0300'],['Temperature','Precipitation Total']]
Out[17]:
Temperature Precipitation Total
Timestamp
20201001T0300 13.087169 0.0

Filtering

For filtering, we just make a proposition, and get the True/False Boolean results. Let's work with a smaller dataframe:

In [18]:
sdf = df.iloc[0:5,0:2]
sdf
Out[18]:
Temperature Relative Humidity
Timestamp
20201001T0000 12.437169 59.0
20201001T0100 12.557169 63.0
20201001T0200 13.177169 68.0
20201001T0300 13.087169 76.0
20201001T0400 12.867169 81.0

Let's try to find those entries with temperature below 13 degrees:

In [19]:
sdf.iloc[:,[0]] < 13
Out[19]:
Temperature
Timestamp
20201001T0000 True
20201001T0100 True
20201001T0200 False
20201001T0300 False
20201001T0400 True

So we see that for 3 of the 5 entries, this assertion is correct, let's pick them:

In [20]:
filter1 = sdf.iloc[:,[0]] < 13
sdf.loc[filter1['Temperature']==True]
Out[20]:
Temperature Relative Humidity
Timestamp
20201001T0000 12.437169 59.0
20201001T0100 12.557169 63.0
20201001T0400 12.867169 81.0

We can achieve the same thing by going straight to the heart of the issue using the column name directly:

In [21]:
sdf.Temperature < 13
Out[21]:
Timestamp
20201001T0000     True
20201001T0100     True
20201001T0200    False
20201001T0300    False
20201001T0400     True
Name: Temperature, dtype: bool

and then feeding this as the index:

In [22]:
sdf[sdf.Temperature < 13]
Out[22]:
Temperature Relative Humidity
Timestamp
20201001T0000 12.437169 59.0
20201001T0100 12.557169 63.0
20201001T0400 12.867169 81.0

What about the row index? Can we specify a criteria for them as well? First let's re-display our small dataframe:

In [23]:
sdf
Out[23]:
Temperature Relative Humidity
Timestamp
20201001T0000 12.437169 59.0
20201001T0100 12.557169 63.0
20201001T0200 13.177169 68.0
20201001T0300 13.087169 76.0
20201001T0400 12.867169 81.0

Let's seek the ones recorded before (and including) 01:00 or the one at 04:00 (here or is for the union, not intersection!)

In [24]:
# Joining multiple criteria based on row label (index)
sdf[(sdf.index <= '20201001T0100') | (sdf.index == '20201001T0400')]
Out[24]:
Temperature Relative Humidity
Timestamp
20201001T0000 12.437169 59.0
20201001T0100 12.557169 63.0
20201001T0400 12.867169 81.0

As the row index is string, we don't even need to write the whole timestamp to the end to make the comparison:

In [25]:
sdf.index > "20201001T02"
Out[25]:
array([False, False,  True,  True,  True])

Processing

Now that we know how to slice via filtering, we can do whatever we want with the sections of the data we're interested in. For example, let us calculate the mean temperature in October, 1st:

In [26]:
# First get the October 1st day's Temperature data:
oct1_temp = df[df.index < "20201002"]['Temperature']
print(oct1_temp)
Timestamp
20201001T0000    12.437169
20201001T0100    12.557169
20201001T0200    13.177169
20201001T0300    13.087169
20201001T0400    12.867169
                   ...    
20201001T1900    18.537169
20201001T2000    16.797169
20201001T2100    14.717169
20201001T2200    12.627169
20201001T2300    11.187169
Name: Temperature, Length: 24, dtype: float64
In [27]:
oct1_temp.mean()
Out[27]:
16.406335583333334

Let's calculate all the daily mean temperatures:

In [28]:
for i in range(1,9):
    day_temp = df[(df.index > "2020100"+str(i-1)) & (df.index < "2020100"+str(i+1))]['Temperature']
    day_temp_mean = day_temp.mean()
    print ("October, {} mean temperature: {:5.2f} degree C".format(i,day_temp_mean))
October, 1 mean temperature: 16.41 degree C
October, 2 mean temperature: 15.77 degree C
October, 3 mean temperature: 16.27 degree C
October, 4 mean temperature: 17.63 degree C
October, 5 mean temperature: 18.85 degree C
October, 6 mean temperature: 21.03 degree C
October, 7 mean temperature: 22.93 degree C
October, 8 mean temperature: 23.92 degree C

Creating a dataframe

We can directly create a dataframe with the DataFrame command:

In [29]:
# Start with an empty dataframe:
mean_temps = pd.DataFrame({'day' : [],'meanTemp' : []})
mean_temps
Out[29]:
day meanTemp
In [30]:
# We can fill it individually
mean_temps.append({'day': "20201001", 'meanTemp' : 16.41}, ignore_index=True)
Out[30]:
day meanTemp
0 20201001 16.41
In [31]:
# or create it from the existing list
days = []
means = []
for i in range(1,9):
    day_temp = df[(df.index > "2020100"+str(i-1)) & (df.index < "2020100"+str(i+1))]\
        ['Temperature']
    day_temp_mean = day_temp.mean()
    # print ("October, {} mean temperature: {:5.2f} degree C".format(i,day_temp_mean))
    days.append("2020100"+str(i))
    means.append(day_temp_mean)

data = {'days': days, 'means': means}
df_mean_temps = pd.DataFrame(data)
df_mean_temps
Out[31]:
days means
0 20201001 16.406336
1 20201002 15.769461
2 20201003 16.271544
3 20201004 17.629669
4 20201005 18.854461
5 20201006 21.028211
6 20201007 22.930294
7 20201008 23.916753
In [32]:
new_entry = pd.Series({'days': "20201013",'means':"23.5"})
df_mean_temps = df_mean_temps.append(new_entry, ignore_index=True)
df_mean_temps
Out[32]:
days means
0 20201001 16.4063
1 20201002 15.7695
2 20201003 16.2715
3 20201004 17.6297
4 20201005 18.8545
5 20201006 21.0282
6 20201007 22.9303
7 20201008 23.9168
8 20201013 23.5
In [33]:
# Specifying the index:
new_entry = pd.Series({'days': "20201011",'means':"24.5"}, name = 'k14')
df_mean_temps = df_mean_temps.append(new_entry)
In [34]:
df_mean_temps.index
Out[34]:
Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 'k14'], dtype='object')
In [35]:
df_mean_temps.shape
Out[35]:
(10, 2)
In [36]:
df_mean_temps.iloc[[9],:]
Out[36]:
days means
k14 20201011 24.5
In [37]:
df_mean_temps.loc[['k14'],:]
Out[37]:
days means
k14 20201011 24.5

Exporting the DataFrame to a CSV file

In [38]:
df_mean_temps
Out[38]:
days means
0 20201001 16.4063
1 20201002 15.7695
2 20201003 16.2715
3 20201004 17.6297
4 20201005 18.8545
5 20201006 21.0282
6 20201007 22.9303
7 20201008 23.9168
8 20201013 23.5
k14 20201011 24.5
In [39]:
# With the row labels (index) and column labels included:
df_mean_temps.to_csv('data/01_out.csv')
print(df_mean_temps.to_csv())
,days,means
0,20201001,16.406335583333334
1,20201002,15.769460629166666
2,20201003,16.271544045833334
3,20201004,17.629669104166666
4,20201005,18.854460770833334
5,20201006,21.028210770833333
6,20201007,22.93029415625
7,20201008,23.916752552083334
8,20201013,23.5
k14,20201011,24.5

In [40]:
# Without the row labels but with the column labels:
print(df_mean_temps.to_csv(index=False))
days,means
20201001,16.406335583333334
20201002,15.769460629166666
20201003,16.271544045833334
20201004,17.629669104166666
20201005,18.854460770833334
20201006,21.028210770833333
20201007,22.93029415625
20201008,23.916752552083334
20201013,23.5
20201011,24.5

In [41]:
# Without the row labels and the column labels:
print(df_mean_temps.to_csv(index=False, header=False))
20201001,16.406335583333334
20201002,15.769460629166666
20201003,16.271544045833334
20201004,17.629669104166666
20201005,18.854460770833334
20201006,21.028210770833333
20201007,22.93029415625
20201008,23.916752552083334
20201013,23.5
20201011,24.5

In [42]:
# Specify header for the index column:
print(df_mean_temps.to_csv(index_label='id'))
id,days,means
0,20201001,16.406335583333334
1,20201002,15.769460629166666
2,20201003,16.271544045833334
3,20201004,17.629669104166666
4,20201005,18.854460770833334
5,20201006,21.028210770833333
6,20201007,22.93029415625
7,20201008,23.916752552083334
8,20201013,23.5
k14,20201011,24.5

In [43]:
# Use ';' as the seperator, instead of ','
df_mean_temps.to_csv(sep=';',path_or_buf='data/01_out.csv')
print(df_mean_temps.to_csv(sep=';'))
;days;means
0;20201001;16.406335583333334
1;20201002;15.769460629166666
2;20201003;16.271544045833334
3;20201004;17.629669104166666
4;20201005;18.854460770833334
5;20201006;21.028210770833333
6;20201007;22.93029415625
7;20201008;23.916752552083334
8;20201013;23.5
k14;20201011;24.5

NumPy way of doing things

Even though Pandas offer very flexible options of doing things, we could as well have used NumPy:

In [1]:
import numpy as np

data = np.genfromtxt("data/01_dataexport_20201008T180753.csv", delimiter=',')
print(data)
[[        nan         nan         nan         nan]
 [        nan  40.01502    40.01502    40.01502  ]
 [        nan  32.73532    32.73532    32.73532  ]
 [        nan 923.411     923.411     923.411    ]
 [        nan         nan         nan         nan]
 [        nan         nan         nan         nan]
 [        nan         nan         nan         nan]
 [        nan         nan         nan         nan]
 [        nan         nan         nan         nan]
 [        nan         nan         nan         nan]
 [        nan  12.437169   59.          0.       ]
 [        nan  12.557169   63.          0.       ]
 [        nan  13.177169   68.          0.       ]
 [        nan  13.087169   76.          0.       ]
 [        nan  12.867169   81.          0.       ]
 [        nan  11.567169   88.          0.       ]
 [        nan  11.177169   90.          0.       ]
 [        nan  12.187169   87.          0.       ]
 [        nan  13.797169   78.          0.       ]
 [        nan  14.967169   72.          0.       ]
 [        nan  16.787169   62.          0.       ]
 [        nan  18.367168   55.          0.       ]
 [        nan  20.957169   41.          0.       ]
 [        nan  22.057169   36.          0.       ]
 [        nan  22.857168   33.          0.       ]
 [        nan  23.527168   29.          0.       ]
 [        nan  23.40717    28.          0.       ]
 [        nan  22.687168   28.          0.       ]
 [        nan  21.41717    30.          0.       ]
 [        nan  18.537169   35.          0.       ]
 [        nan  16.797169   39.          0.       ]
 [        nan  14.717169   44.          0.       ]
 [        nan  12.627169   54.          0.       ]
 [        nan  11.187169   64.          0.       ]
 [        nan   9.9971695  72.          0.       ]
 [        nan   9.057169   80.          0.       ]
 [        nan   8.357169   86.          0.       ]
 [        nan   7.727169   89.          0.       ]
 [        nan   7.307169   90.          0.       ]
 [        nan   7.107169   92.          0.       ]
 [        nan   7.177169   91.          0.       ]
 [        nan   7.1271687  90.          0.       ]
 [        nan   9.537169   76.          0.       ]
 [        nan  13.377169   64.          0.       ]
 [        nan  16.73717    52.          0.       ]
 [        nan  19.697168   38.          0.       ]
 [        nan  21.047169   29.          0.       ]
 [        nan  22.027168   26.          0.       ]
 [        nan  22.787169   23.          0.       ]
 [        nan  23.187168   22.          0.       ]
 [        nan  23.037169   20.          0.       ]
 [        nan  22.357168   21.          0.       ]
 [        nan  20.83717    23.          0.       ]
 [        nan  18.64717    26.          0.       ]
 [        nan  17.67717    28.          0.       ]
 [        nan  16.98717    29.          0.       ]
 [        nan  16.087168   31.          0.       ]
 [        nan  15.297169   33.          0.       ]
 [        nan  13.867169   36.          0.       ]
 [        nan  12.397169   41.          0.       ]
 [        nan  11.107169   46.          0.       ]
 [        nan  10.157169   51.          0.       ]
 [        nan  10.587169   51.          0.       ]
 [        nan  10.507169   50.          0.       ]
 [        nan  10.147169   49.          0.       ]
 [        nan   9.617169   47.          0.       ]
 [        nan  11.397169   42.          0.       ]
 [        nan  16.32717    31.          0.       ]
 [        nan  20.09717    24.          0.       ]
 [        nan  21.947168   21.          0.       ]
 [        nan  23.217169   19.          0.       ]
 [        nan  24.187168   18.          0.       ]
 [        nan  24.887169   16.          0.       ]
 [        nan  25.297169   16.          0.       ]
 [        nan  25.16717    15.          0.       ]
 [        nan  24.477169   16.          0.       ]
 [        nan  22.58717    19.          0.       ]
 [        nan  20.83717    21.          0.       ]
 [        nan  19.797169   23.          0.       ]
 [        nan  18.447168   25.          0.       ]
 [        nan  16.707169   29.          0.       ]
 [        nan  14.087169   35.          0.       ]
 [        nan  11.627169   44.          0.       ]
 [        nan   9.607169   57.          0.       ]
 [        nan   8.857169   65.          0.       ]
 [        nan   8.597169   70.          0.       ]
 [        nan  10.287169   62.          0.       ]
 [        nan  11.367169   55.          0.       ]
 [        nan  11.787169   51.          0.       ]
 [        nan  11.687169   50.          0.       ]
 [        nan  12.857169   46.          0.       ]
 [        nan  17.277168   35.          0.       ]
 [        nan  20.51717    28.          0.       ]
 [        nan  22.697168   25.          0.       ]
 [        nan  24.217169   21.          0.       ]
 [        nan  25.527168   18.          0.       ]
 [        nan  26.48717    15.          0.       ]
 [        nan  27.06717    14.          0.       ]
 [        nan  27.06717    14.          0.       ]
 [        nan  26.48717    15.          0.       ]
 [        nan  24.357168   17.          0.       ]
 [        nan  22.057169   20.          0.       ]
 [        nan  20.31717    23.          0.       ]
 [        nan  18.31717    27.          0.       ]
 [        nan  15.907169   32.          0.       ]
 [        nan  13.397169   38.          0.       ]
 [        nan  11.437169   46.          0.       ]
 [        nan  10.687169   50.          0.       ]
 [        nan  10.317169   54.          0.       ]
 [        nan  10.107169   55.          0.       ]
 [        nan  11.477169   51.          0.       ]
 [        nan  12.227169   48.          0.       ]
 [        nan  12.567169   46.          0.       ]
 [        nan  12.347169   46.          0.       ]
 [        nan  13.457169   42.          0.       ]
 [        nan  17.947168   32.          0.       ]
 [        nan  21.64717    25.          0.       ]
 [        nan  24.447168   21.          0.       ]
 [        nan  27.00717    18.          0.       ]
 [        nan  29.07717    15.          0.       ]
 [        nan  30.67717    14.          0.       ]
 [        nan  31.867168   13.          0.       ]
 [        nan  31.697168   14.          0.       ]
 [        nan  30.627169   15.          0.       ]
 [        nan  28.06717    19.          0.       ]
 [        nan  25.797169   23.          0.       ]
 [        nan  22.367168   29.          0.       ]
 [        nan  18.81717    36.          0.       ]
 [        nan  16.58717    42.          0.       ]
 [        nan  15.387169   45.          0.       ]
 [        nan  14.857169   47.          0.       ]
 [        nan  14.467169   49.          0.       ]
 [        nan  14.337169   50.          0.       ]
 [        nan  14.147169   51.          0.       ]
 [        nan  15.477169   47.          0.       ]
 [        nan  16.107168   46.          0.       ]
 [        nan  16.297169   45.          0.       ]
 [        nan  16.097168   46.          0.       ]
 [        nan  17.227169   43.          0.       ]
 [        nan  21.34717    33.          0.       ]
 [        nan  25.057169   27.          0.       ]
 [        nan  28.187168   22.          0.       ]
 [        nan  30.57717    18.          0.       ]
 [        nan  31.84717    16.          0.       ]
 [        nan  32.27717    15.          0.       ]
 [        nan  32.33717    15.          0.       ]
 [        nan  31.947168   15.          0.       ]
 [        nan  30.807169   16.          0.       ]
 [        nan  28.65717    19.          0.       ]
 [        nan  26.527168   22.          0.       ]
 [        nan  22.297169   29.          0.       ]
 [        nan  19.40717    35.          0.       ]
 [        nan  17.01717    42.          0.       ]
 [        nan  15.407169   48.          0.       ]
 [        nan  14.817169   51.          0.       ]
 [        nan  14.7471695  51.          0.       ]
 [        nan  14.847169   51.          0.       ]
 [        nan  15.127169   49.          0.       ]
 [        nan  16.49717    44.          0.       ]
 [        nan  17.00717    42.          0.       ]
 [        nan  17.16717    41.          0.       ]
 [        nan  17.07717    42.          0.       ]
 [        nan  18.32717    39.          0.       ]
 [        nan  22.24717    31.          0.       ]
 [        nan  25.92717    25.          0.       ]
 [        nan  28.807169   20.          0.       ]
 [        nan  30.807169   18.          0.       ]
 [        nan  31.717169   17.          0.       ]
 [        nan  32.057167   16.          0.       ]
 [        nan  32.107166   15.          0.       ]
 [        nan  31.39717    15.          0.       ]
 [        nan  30.33717    17.          0.       ]
 [        nan  28.56717    20.          0.       ]
 [        nan  27.07717    22.          0.       ]
 [        nan  26.34717    24.          0.       ]
 [        nan  25.867168   24.          0.       ]
 [        nan  24.947168   26.          0.       ]
 [        nan  24.117168   27.          0.       ]
 [        nan  24.047169   28.          0.       ]
 [        nan  22.50717    30.          0.       ]
 [        nan  21.047169   33.          0.       ]
 [        nan  19.51717    37.          0.       ]
 [        nan  18.447168   40.          0.       ]
 [        nan  17.41717    44.          0.       ]
 [        nan  16.59717    50.          0.       ]
 [        nan  15.627169   57.          0.       ]
 [        nan  17.15717    52.          0.       ]
 [        nan  22.51717    36.          0.       ]
 [        nan  25.977169   30.          0.       ]
 [        nan  28.51717    26.          0.       ]
 [        nan  29.56717    24.          0.       ]
 [        nan  29.137169   19.          0.       ]
 [        nan  30.00717    17.          0.       ]
 [        nan  30.287169   16.          0.       ]
 [        nan  30.357168   16.          0.       ]
 [        nan  29.857168   16.          0.       ]
 [        nan  28.33717    18.          0.       ]
 [        nan  26.437168   20.          0.       ]
 [        nan  25.56717    21.          0.       ]
 [        nan  24.777168   22.          0.       ]
 [        nan  23.627169   24.          0.       ]
 [        nan  22.727169   27.          0.       ]]

As you can see, unfortunately, we can't place different types of variables into the same array. But other than that, the rest of the operations are similar:

In [2]:
data.shape
Out[2]:
(202, 4)
In [3]:
data_crop = data[11:,1:4]
In [4]:
data_crop
Out[4]:
array([[12.557169 , 63.       ,  0.       ],
       [13.177169 , 68.       ,  0.       ],
       [13.087169 , 76.       ,  0.       ],
       [12.867169 , 81.       ,  0.       ],
       [11.567169 , 88.       ,  0.       ],
       [11.177169 , 90.       ,  0.       ],
       [12.187169 , 87.       ,  0.       ],
       [13.797169 , 78.       ,  0.       ],
       [14.967169 , 72.       ,  0.       ],
       [16.787169 , 62.       ,  0.       ],
       [18.367168 , 55.       ,  0.       ],
       [20.957169 , 41.       ,  0.       ],
       [22.057169 , 36.       ,  0.       ],
       [22.857168 , 33.       ,  0.       ],
       [23.527168 , 29.       ,  0.       ],
       [23.40717  , 28.       ,  0.       ],
       [22.687168 , 28.       ,  0.       ],
       [21.41717  , 30.       ,  0.       ],
       [18.537169 , 35.       ,  0.       ],
       [16.797169 , 39.       ,  0.       ],
       [14.717169 , 44.       ,  0.       ],
       [12.627169 , 54.       ,  0.       ],
       [11.187169 , 64.       ,  0.       ],
       [ 9.9971695, 72.       ,  0.       ],
       [ 9.057169 , 80.       ,  0.       ],
       [ 8.357169 , 86.       ,  0.       ],
       [ 7.727169 , 89.       ,  0.       ],
       [ 7.307169 , 90.       ,  0.       ],
       [ 7.107169 , 92.       ,  0.       ],
       [ 7.177169 , 91.       ,  0.       ],
       [ 7.1271687, 90.       ,  0.       ],
       [ 9.537169 , 76.       ,  0.       ],
       [13.377169 , 64.       ,  0.       ],
       [16.73717  , 52.       ,  0.       ],
       [19.697168 , 38.       ,  0.       ],
       [21.047169 , 29.       ,  0.       ],
       [22.027168 , 26.       ,  0.       ],
       [22.787169 , 23.       ,  0.       ],
       [23.187168 , 22.       ,  0.       ],
       [23.037169 , 20.       ,  0.       ],
       [22.357168 , 21.       ,  0.       ],
       [20.83717  , 23.       ,  0.       ],
       [18.64717  , 26.       ,  0.       ],
       [17.67717  , 28.       ,  0.       ],
       [16.98717  , 29.       ,  0.       ],
       [16.087168 , 31.       ,  0.       ],
       [15.297169 , 33.       ,  0.       ],
       [13.867169 , 36.       ,  0.       ],
       [12.397169 , 41.       ,  0.       ],
       [11.107169 , 46.       ,  0.       ],
       [10.157169 , 51.       ,  0.       ],
       [10.587169 , 51.       ,  0.       ],
       [10.507169 , 50.       ,  0.       ],
       [10.147169 , 49.       ,  0.       ],
       [ 9.617169 , 47.       ,  0.       ],
       [11.397169 , 42.       ,  0.       ],
       [16.32717  , 31.       ,  0.       ],
       [20.09717  , 24.       ,  0.       ],
       [21.947168 , 21.       ,  0.       ],
       [23.217169 , 19.       ,  0.       ],
       [24.187168 , 18.       ,  0.       ],
       [24.887169 , 16.       ,  0.       ],
       [25.297169 , 16.       ,  0.       ],
       [25.16717  , 15.       ,  0.       ],
       [24.477169 , 16.       ,  0.       ],
       [22.58717  , 19.       ,  0.       ],
       [20.83717  , 21.       ,  0.       ],
       [19.797169 , 23.       ,  0.       ],
       [18.447168 , 25.       ,  0.       ],
       [16.707169 , 29.       ,  0.       ],
       [14.087169 , 35.       ,  0.       ],
       [11.627169 , 44.       ,  0.       ],
       [ 9.607169 , 57.       ,  0.       ],
       [ 8.857169 , 65.       ,  0.       ],
       [ 8.597169 , 70.       ,  0.       ],
       [10.287169 , 62.       ,  0.       ],
       [11.367169 , 55.       ,  0.       ],
       [11.787169 , 51.       ,  0.       ],
       [11.687169 , 50.       ,  0.       ],
       [12.857169 , 46.       ,  0.       ],
       [17.277168 , 35.       ,  0.       ],
       [20.51717  , 28.       ,  0.       ],
       [22.697168 , 25.       ,  0.       ],
       [24.217169 , 21.       ,  0.       ],
       [25.527168 , 18.       ,  0.       ],
       [26.48717  , 15.       ,  0.       ],
       [27.06717  , 14.       ,  0.       ],
       [27.06717  , 14.       ,  0.       ],
       [26.48717  , 15.       ,  0.       ],
       [24.357168 , 17.       ,  0.       ],
       [22.057169 , 20.       ,  0.       ],
       [20.31717  , 23.       ,  0.       ],
       [18.31717  , 27.       ,  0.       ],
       [15.907169 , 32.       ,  0.       ],
       [13.397169 , 38.       ,  0.       ],
       [11.437169 , 46.       ,  0.       ],
       [10.687169 , 50.       ,  0.       ],
       [10.317169 , 54.       ,  0.       ],
       [10.107169 , 55.       ,  0.       ],
       [11.477169 , 51.       ,  0.       ],
       [12.227169 , 48.       ,  0.       ],
       [12.567169 , 46.       ,  0.       ],
       [12.347169 , 46.       ,  0.       ],
       [13.457169 , 42.       ,  0.       ],
       [17.947168 , 32.       ,  0.       ],
       [21.64717  , 25.       ,  0.       ],
       [24.447168 , 21.       ,  0.       ],
       [27.00717  , 18.       ,  0.       ],
       [29.07717  , 15.       ,  0.       ],
       [30.67717  , 14.       ,  0.       ],
       [31.867168 , 13.       ,  0.       ],
       [31.697168 , 14.       ,  0.       ],
       [30.627169 , 15.       ,  0.       ],
       [28.06717  , 19.       ,  0.       ],
       [25.797169 , 23.       ,  0.       ],
       [22.367168 , 29.       ,  0.       ],
       [18.81717  , 36.       ,  0.       ],
       [16.58717  , 42.       ,  0.       ],
       [15.387169 , 45.       ,  0.       ],
       [14.857169 , 47.       ,  0.       ],
       [14.467169 , 49.       ,  0.       ],
       [14.337169 , 50.       ,  0.       ],
       [14.147169 , 51.       ,  0.       ],
       [15.477169 , 47.       ,  0.       ],
       [16.107168 , 46.       ,  0.       ],
       [16.297169 , 45.       ,  0.       ],
       [16.097168 , 46.       ,  0.       ],
       [17.227169 , 43.       ,  0.       ],
       [21.34717  , 33.       ,  0.       ],
       [25.057169 , 27.       ,  0.       ],
       [28.187168 , 22.       ,  0.       ],
       [30.57717  , 18.       ,  0.       ],
       [31.84717  , 16.       ,  0.       ],
       [32.27717  , 15.       ,  0.       ],
       [32.33717  , 15.       ,  0.       ],
       [31.947168 , 15.       ,  0.       ],
       [30.807169 , 16.       ,  0.       ],
       [28.65717  , 19.       ,  0.       ],
       [26.527168 , 22.       ,  0.       ],
       [22.297169 , 29.       ,  0.       ],
       [19.40717  , 35.       ,  0.       ],
       [17.01717  , 42.       ,  0.       ],
       [15.407169 , 48.       ,  0.       ],
       [14.817169 , 51.       ,  0.       ],
       [14.7471695, 51.       ,  0.       ],
       [14.847169 , 51.       ,  0.       ],
       [15.127169 , 49.       ,  0.       ],
       [16.49717  , 44.       ,  0.       ],
       [17.00717  , 42.       ,  0.       ],
       [17.16717  , 41.       ,  0.       ],
       [17.07717  , 42.       ,  0.       ],
       [18.32717  , 39.       ,  0.       ],
       [22.24717  , 31.       ,  0.       ],
       [25.92717  , 25.       ,  0.       ],
       [28.807169 , 20.       ,  0.       ],
       [30.807169 , 18.       ,  0.       ],
       [31.717169 , 17.       ,  0.       ],
       [32.057167 , 16.       ,  0.       ],
       [32.107166 , 15.       ,  0.       ],
       [31.39717  , 15.       ,  0.       ],
       [30.33717  , 17.       ,  0.       ],
       [28.56717  , 20.       ,  0.       ],
       [27.07717  , 22.       ,  0.       ],
       [26.34717  , 24.       ,  0.       ],
       [25.867168 , 24.       ,  0.       ],
       [24.947168 , 26.       ,  0.       ],
       [24.117168 , 27.       ,  0.       ],
       [24.047169 , 28.       ,  0.       ],
       [22.50717  , 30.       ,  0.       ],
       [21.047169 , 33.       ,  0.       ],
       [19.51717  , 37.       ,  0.       ],
       [18.447168 , 40.       ,  0.       ],
       [17.41717  , 44.       ,  0.       ],
       [16.59717  , 50.       ,  0.       ],
       [15.627169 , 57.       ,  0.       ],
       [17.15717  , 52.       ,  0.       ],
       [22.51717  , 36.       ,  0.       ],
       [25.977169 , 30.       ,  0.       ],
       [28.51717  , 26.       ,  0.       ],
       [29.56717  , 24.       ,  0.       ],
       [29.137169 , 19.       ,  0.       ],
       [30.00717  , 17.       ,  0.       ],
       [30.287169 , 16.       ,  0.       ],
       [30.357168 , 16.       ,  0.       ],
       [29.857168 , 16.       ,  0.       ],
       [28.33717  , 18.       ,  0.       ],
       [26.437168 , 20.       ,  0.       ],
       [25.56717  , 21.       ,  0.       ],
       [24.777168 , 22.       ,  0.       ],
       [23.627169 , 24.       ,  0.       ],
       [22.727169 , 27.       ,  0.       ]])
In [5]:
np.mean(data_crop[:,0])
Out[5]:
19.623451820418843
In [ ]: