Load Data

Older Approach

This script first reads the data source (JPL, CSR, or ITSG), then reads the file path for the GRACE L2 spherical-harmonic input files, and finally reads the paths of the TN-13 and TN-14 replacement files corresponding to the chosen source.

read_GRACE_SH_paths(use_sample_files=0)

Returns path of data files, path of tn13 and path of tn14 replacement files

Parameters:

Name Type Description Default
use_sample_files int

Set to 1 to use the sample TN-13/TN-14 replacement files bundled with the package; with the default of 0 the user is prompted for the file paths. Defaults to 0.

0

Raises:

Type Description
Exception

Raised when the entered source is not one of JPL, CSR or ITSG.

Returns:

Name Type Description
tuple

Path of the SH data folder, path of the TN-13 file, path of the TN-14 file, and the source string.

Source code in pyshbundle/read_GRACE_SH_paths.py
def read_GRACE_SH_paths(use_sample_files = 0):
    """Returns path of data files, path of tn13 and path of tn14 replacement files

    Args:
        use_sample_files (int, optional): Set to 1 to use the sample TN-13/TN-14
            replacement files bundled with the package; 0 prompts the user for
            the file paths. Defaults to 0.

    Raises:
        Exception: If the entered source is not one of JPL, CSR or ITSG.

    Returns:
        tuple: (path_sh, path_tn13, path_tn14, source)
    """

    print("This program supports working with GRACE L2 Spherical harmonics data from the following centers: CSR, JPL and ITSG")
    print("Instructions to download data may be referred to in https://github.com/mn5hk/pyshbundle/blob/main/docs/index.md#how-to-download-data")
    source = str(input("Enter the source of L2 SH coeffs code(jpl, csr, itsg): "))

    if use_sample_files ==1:

        print("You have chosen to use sample replacement files.")
        print("The replacement files for the TN13 and TN14 parameters have been preloaded into the program")
        print("Due to the size of the GRACE SH files, these have not been preloaded into the program")
        print("You may download the GRACE SH L2 files from the link below. Please ensure to download the files as per your selection of source in the prior step")
        print("Download sample files from: https://github.com/mn5hk/pyshbundle/tree/main/sample_input_data")
    path_sh = str(input("Enter the path to the folder with SH L2 data"))


    if str.upper(source) == 'JPL':
        if use_sample_files == 1:
            path_tn13 = pkg_resources.resource_filename('pyshbundle', 'data/sample_JPL_TN_files/TN-13_GEOC_JPL_RL06.txt')
            path_tn14 = pkg_resources.resource_filename('pyshbundle', 'data/sample_JPL_TN_files/TN-14_C30_C20_GSFC_SLR.txt')
            print("Successfully loaded preloaded TN13 and TN14 replacement files for JPL")
        else:
            path_tn13 = str(input("Enter the path to the file for tn13 replacement in .txt format"))
            path_tn14 = str(input("Enter the path to the file for tn14 replacement in .txt format"))
            print("Successfully loaded TN13 and TN14 replacement files for JPL")

    elif str.upper(source) == 'CSR':
        if use_sample_files == 1:
            path_tn13 = pkg_resources.resource_filename('pyshbundle', 'data/sample_CSR_TN_files/TN-13_GEOC_CSR_RL06.1.txt')
            path_tn14 = pkg_resources.resource_filename('pyshbundle', 'data/sample_CSR_TN_files/TN-14_C30_C20_SLR_GSFC.txt')
            print("Successfully loaded preloaded TN13 and TN14 replacement files for CSR")
        else:
            path_tn13 = str(input("Enter the path to the file for tn13 replacement in .txt format"))
            path_tn14 = str(input("Enter the path to the file for tn14 replacement in .txt format"))
            print("Successfully loaded TN13 and TN14 replacement files for CSR")

    elif str.upper(source) == 'ITSG':
        if use_sample_files == 1:
            path_tn13 = pkg_resources.resource_filename('pyshbundle', 'data/sample_ITSG_TN_files/TN-13_GEOC_CSR_RL06.1.txt')
            path_tn14 = pkg_resources.resource_filename('pyshbundle', 'data/sample_ITSG_TN_files/TN-14_C30_C20_SLR_GSFC.txt')
            print("Successfully loaded preloaded TN13 and TN14 replacement files for ITSG")
        else:
            path_tn13 = str(input("Enter the path to the file for tn13 replacement in .txt format"))
            path_tn14 = str(input("Enter the path to the file for tn14 replacement in .txt format"))
            print("Successfully loaded TN13 and TN14 replacement files for ITSG")
    else:
        raise Exception("Source selection is incorrect. Please select between JPL, CSR or ITSG")

    return path_sh, path_tn13, path_tn14, source
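Because the function is interactive, the only non-trivial logic is the source check at the end. A minimal, testable sketch of that validation step (the helper `normalise_source` is hypothetical, not part of pyshbundle):

```python
# Sketch of the source-validation step performed inside
# read_GRACE_SH_paths; normalise_source is a hypothetical helper.
def normalise_source(source: str) -> str:
    """Upper-case the user input and verify it is a supported centre."""
    source = source.strip().upper()
    if source not in ("JPL", "CSR", "ITSG"):
        raise Exception("Source selection is incorrect. "
                        "Please select between JPL, CSR or ITSG")
    return source
```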

Replaces certain low-degree coefficients with the given replacement coefficients.

Parameters:

Name Type Description Default
path str

Path to the folder containing the GRACE L2 .gz files.

required
path_tn14 str

Path to the TN-14 C20/C30 replacement file.

required
path_tn13 str

Path to the TN-13 degree-1 replacement file.

required

Returns:

Name Type Description
tuple

saved_as_num (list of degree, order, clm, slm, clm_std, slm_std, start_date, end_date), dates_start, dates_end, and the number of files read.
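The replacement step overwrites the C20 (and, where available, C30) coefficients read from the L2 files with the SLR-derived values from TN-14. A toy sketch of that idea on a flat coefficient list (the helper and all values are hypothetical):

```python
import math

# Toy illustration of the TN-14 replacement idea: overwrite C20 (and C30
# when it is not NaN) at known positions in a flat coefficient list.
def replace_c20_c30(clm, c20, c30, i20, i30):
    clm = list(clm)
    clm[i20] = c20
    if not math.isnan(c30):   # TN-14 lists NaN for months without a C30 value
        clm[i30] = c30
    return clm
```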

Source code in pyshbundle/reader_replacer_jpl.py
def reader_replacer_jpl(path, path_tn14, path_tn13):
    """Replaces certain low-degree coefficients with the given replacement coefficients.

    Args:
        path (str): Path to the folder containing the GRACE L2 .gz files.
        path_tn14 (str): Path to the TN-14 C20/C30 replacement file.
        path_tn13 (str): Path to the TN-13 degree-1 replacement file.

    Returns:
        tuple: (saved_as_num, dates_start, dates_end, no_of_files)
    """
    file_list = os.listdir(path)

    filenames = os.listdir(path)  # Names of files in folder
    # Identify the data product source
    if 'GFZ' in str(filenames[0]):
        source = str('GFZ')
    elif 'CSR' in str(filenames[0]):
        source = str('CSR')
    elif 'JPL' in str(filenames[0]):
        source = str('JPL')
    elif 'ITSG' in str(filenames[0]):
        source = str('ITSG')
    print(source)
    aa = sorted(file_list, key=last_4chars)  # sort files chronologically; the first file gives the start year
    year_start = aa[0][6:10]
    time_axes = 0

    # Counts the number of files
    no_of_files = 0

    # Empty lists to store variables
    line_num = 0
    oi = 22
    degree = [[] for x in range(oi)]
    order = [[] for x in range(oi)]
    clm = [[] for x in range(oi)]
    slm = [[] for x in range(oi)]
    slm_std = [[] for x in range(oi)]
    clm_std = [[] for x in range(oi)]
    start_date = [[] for x in range(oi)]
    end_date = [[] for x in range(oi)]

    # Iterate through all files
    for file in sorted(file_list, key=last_4chars):
        # print(file)
        if file.endswith(".gz"):
            file_name = f"{path}/{file}"
            # print(file_name[-39:-32])

            # Save yearwise
            year_start, time_axes = TIME(year_start, file_name, time_axes)

            # Call the function 'reader'
            reader(file_name, line_num, degree, order, clm, slm, slm_std,
                   clm_std, start_date, end_date, year_start, time_axes)
            no_of_files = no_of_files + 1
    print('Reading into clm format complete!')
    print("Number of files read:", no_of_files)

    Lmax = degree[0][-1]
    degree_order = int((Lmax+1) * (Lmax+2)/2)

    # Replacement
    print('Starting replacement')

    # Replace deg 2,3
    new_file_TN14 = path_tn14

    rep_start_date, rep_end_date, c20, c30, c20_sigma, c30_sigma = [], [], [], [], [], []
    with open(new_file_TN14, "r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0, len(stuff), 1):
            line_num = str('not found')
            if stuff[i] == str('Product:\n'):
                print("found:", i+1)
                line_num = i + 1
                break
        if type(line_num) is str:
            print('Replacement data not found')
        pattern = r'\s+'
        count = 0
        while (line_num < len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c20.append(float(split[2]))
            c30.append(float(split[5]))
            c20_sigma.append(float(split[4])*1e-10)
            c30_sigma.append(float(split[7])*1e-10)
            rep_start_date.append(
                str(julian.from_jd(float(split[0]), fmt='mjd').date()))
            rep_end_date.append(
                str(julian.from_jd(float(split[8]), fmt='mjd').date()))
            line_num = line_num + 1
            count = count + 1

    # Actual replacement

    index = 0
    for year in range(0, time_axes+1, 1):
        for y in range(0, round(len(degree[year])/int(degree_order-3)), 1):
            while(index < len(c20)):

                curr_date = start_date[year][y*degree_order]
                rep_date_cmp = rep_start_date[index]

                if curr_date == rep_date_cmp:
                    print(curr_date, rep_date_cmp, index)
                    clm[year][y*int(degree_order-3)] = c20[index]
                    clm_std[year][y*int(degree_order-3)] = c20_sigma[index]
                    if math.isnan(c30[index]) == False:
                        clm[year][y*int(degree_order-3)+3] = c30[index]
                        clm_std[year][y*int(degree_order-3) +
                                      3] = c30_sigma[index]
                    index = index + 1
                    break
                else:
                    index = index + 1
    print('Degree 2,3 replacement complete!')

    ''' Replace deg 1 '''
    new_file_TN13 = path_tn13
    rep_start_date_deg1, rep_end_date_deg1, c1m, s1m, c1m_sigma, s1m_sigma = [], [], [], [], [], []
    with open(new_file_TN13, "r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0, len(stuff), 1):
            if stuff[i] == str('end of header ===============================================================================\n'):
                print("found:", i+1)
                line_num = i + 1
                break
            else:
                line_num = str('not found')
        pattern = r'\s+'
        count = 0
        while (line_num < len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c1m.append(float(split[3]))
            s1m.append(float(split[4]))
            c1m_sigma.append(float(split[5])*1e-10)
            s1m_sigma.append(float(split[6])*1e-10)
            rep_start_date_deg1.append(
                split[7][0:4] + '-'+split[7][4:6] + '-'+split[7][6:8])
            rep_end_date_deg1.append(
                split[8][0:4] + '-'+split[8][4:6] + '-'+split[8][6:8])
            line_num = line_num + 1
            count = count + 1

    # replace deg 1
    index = 0
    for year in range(0, time_axes+1, 1):
        for y in range(0, int(len(degree[year])/int(degree_order-3)), 1):
            while(index < len(c1m)):

                curr_date_deg1 = start_date[year][y*int(degree_order-1)]
                rep_date_cmp_deg1 = rep_start_date_deg1[index]

                if curr_date_deg1 == rep_date_cmp_deg1:
                    print(curr_date_deg1, rep_date_cmp_deg1, index)
                    degree[year].insert(y*int(degree_order-1), int(1))
                    degree[year].insert(y*int(degree_order-1)+1, int(1))
                    order[year].insert(y*int(degree_order-1), int(0))
                    order[year].insert(y*int(degree_order-1)+1, int(1))
                    clm[year].insert(y*int(degree_order-1), float(c1m[index]))
                    clm[year].insert(y*int(degree_order-1) +
                                     1, float(c1m[index+1]))
                    slm[year].insert(y*int(degree_order-1), float(s1m[index]))
                    slm[year].insert(y*int(degree_order-1) +
                                     1, float(s1m[index+1]))
                    clm_std[year].insert(
                        y*int(degree_order-1), float(c1m_sigma[index]))
                    clm_std[year].insert(
                        y*int(degree_order-1)+1, float(c1m_sigma[index+1]))
                    slm_std[year].insert(
                        y*int(degree_order-1), float(s1m_sigma[index]))
                    slm_std[year].insert(
                        y*int(degree_order-1)+1, float(s1m_sigma[index+1]))
                    start_date[year].insert(
                        y*int(degree_order-1), rep_start_date_deg1[index])
                    start_date[year].insert(
                        y*int(degree_order-1)+1, rep_start_date_deg1[index+1])
                    end_date[year].insert(
                        y*int(degree_order-1), rep_end_date_deg1[index])
                    end_date[year].insert(
                        y*int(degree_order-1)+1, rep_end_date_deg1[index+1])
                    index = index + 2
                    break
                else:
                    index = index + 2
    print('Degree 1 replacement complete!')

    ''' Add degree 0 coeffs (zeros) '''
    for year in range(0, time_axes+1, 1):
        for y in range(0, int(len(degree[year])/int(degree_order-1)), 1):
            degree[year].insert(y*degree_order, float(0))
            order[year].insert(y*degree_order, float(0))
            clm[year].insert(y*degree_order, float(0))
            slm[year].insert(y*degree_order, float(0))
            clm_std[year].insert(y*degree_order, float(0))
            slm_std[year].insert(y*degree_order, float(0))
            start_date[year].insert(
                y*degree_order, start_date[year][y*degree_order+1])
            end_date[year].insert(
                y*degree_order, end_date[year][y*degree_order+1])

    ''' Save everything in a list '''
    saved_as_num = [degree, order, clm, slm,
                    clm_std, slm_std, start_date, end_date]

    ''' Number of years '''
    beta = np.zeros(len(start_date))
    sum = 0
    for x in range(0, time_axes+1, 1):
        beta[x] = round(len(start_date[x])/degree_order)
        sum = sum + beta[x]

    ''' Finding the dates for time bounds of data '''
    dates_start, dates_end = [], []
    for i in range(0, time_axes+1, 1):
        j = 0
        while j < beta[i]:
            dates_start.append(str(start_date[i][j*degree_order]))
            dates_end.append(str(end_date[i][j*degree_order]))
            j = j + 1
    print("Number of months of data in each year starting", dates_start[0],
          "& ending", dates_end[-1], beta)

    # import pickle
    # with open("/home/wslvivek/Desktop/level2/pysh_v2/output/saved_as_num","wb") as pk:
    #     pickle.dump(saved_as_num, pk)

    return saved_as_num, dates_start, dates_end, no_of_files
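The `degree_order` quantity used above is the triangular count of (degree, order) pairs with 0 ≤ order ≤ degree ≤ Lmax, i.e. the length of one month's coefficient vector once degrees 0 and 1 are included. A quick sketch:

```python
# Count of (degree, order) pairs with 0 <= order <= degree <= lmax,
# matching degree_order = int((Lmax+1)*(Lmax+2)/2) in the source above.
def degree_order_count(lmax: int) -> int:
    return (lmax + 1) * (lmax + 2) // 2
```

For a typical GRACE L2 product with Lmax = 96 this gives 4753 coefficients per monthly solution.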
Source code in pyshbundle/reader_replacer_csr.py
def reader_replacer_csr(path, path_tn14, path_tn13):
    """Reads the CSR GRACE L2 files in path and replaces the degree-1, C20
    and C30 coefficients using the TN-13 and TN-14 replacement files."""

    file_list = os.listdir(path)

    filenames = os.listdir(path) #Names of files in folder
    # Identify the data product source
    if 'GFZ' in str(filenames[1]):
        source = str('GFZ')  
    elif 'CSR' in str(filenames[1]):
        source = str('CSR')
    elif 'JPL' in str(filenames[1]):
        source = str('JPL')
    elif 'ITSG' in str(filenames[0]):
        source = str('ITSG')
    print(source)
    aa = sorted(file_list, key = last_4chars)  # sort files chronologically; the first file gives the start year
    year_start = aa[0][6:10]
    time_axes = 0

    # Counts the number of files
    no_of_files = 0

    # Empty lists to store variables
    line_num = 0
    oi = 22
    degree = [[] for x in range(oi)]
    order = [[] for x in range(oi)]
    clm = [[] for x in range(oi)]
    slm = [[] for x in range(oi)]
    clm_std = [[] for x in range(oi)]
    slm_std = [[] for x in range(oi)]
    start_date = [[] for x in range(oi)]
    end_date = [[] for x in range(oi)]

    # Iterate through all files
    for file in sorted(file_list, key = last_4chars):
        #print(file)
        if file.endswith(".gz"):
            file_name = f"{path}/{file}"
            # print(file_name[-39:-32])

            # Save yearwise
            year_start, time_axes = TIME(year_start,file_name,time_axes)

            # Call the function 'reader'
            reader(file_name,line_num,degree,order,clm,slm,clm_std,slm_std,start_date,end_date,year_start,time_axes)
            no_of_files = no_of_files + 1

    print('Reading into klm format complete!')
    print("Number of files read:", no_of_files)

    Lmax=degree[0][-1]
    degree_order=int((Lmax+1)*(Lmax+2)/2)
    #''' Replacement '''
    print('Starting replacement')
   # ''' Replace deg 2,3 '''
    new_file_TN14 = path_tn14
    rep_start_date, rep_end_date, c20, c30, c20_sigma, c30_sigma = [], [], [], [], [], []
    with open(new_file_TN14,"r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0,len(stuff),1):
            line_num = str('not found')
            if  stuff[i] == str('Product:\n'):
                print("found:",i+1)
                line_num = i + 1
                break
        if type(line_num) is str:
            print('Replacement data not found')
        pattern = r'\s+'
        count = 0
        while (line_num<len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c20.append( float(split [2]) )
            c30.append( float(split[5]) )
            c20_sigma.append(float(split[4])*1e-10)
            c30_sigma.append(float(split[7])*1e-10)
            rep_start_date.append(str(julian.from_jd(float(split[0]), fmt='mjd').date()))
            rep_end_date.append(str(julian.from_jd(float(split[8]), fmt='mjd').date()))
            line_num = line_num + 1
            count = count + 1

    # Actual replacement    

    index = 0
    margin=datetime.timedelta(days = 5)
    for year in range(0,time_axes+1,1):
        # index = 0
        for y in range(0,round(len(clm[year])/degree_order),1):    
            while(index<len(c20)):

                curr_date = datetime.datetime.strptime(start_date[year][y*degree_order], '%Y-%m-%d')
                rep_date_cmp = datetime.datetime.strptime(rep_start_date[index], '%Y-%m-%d')

                if  rep_date_cmp-margin <= curr_date <= rep_date_cmp+margin:
                    print(curr_date, rep_date_cmp, index)
                    clm[year][y*degree_order+2] = c20[index]
                    clm_std[year][y*degree_order+2] = c20_sigma[index]
                    if math.isnan(c30[index]) == False:
                        clm[year][y*degree_order+3] = c30[index]
                        # NOTE: rectified from slm_dev -> clm_dev
                        clm_std[year][y*degree_order+3] = c30_sigma[index]
                    # increment only after the C30 check so c30[index] refers
                    # to the same TN-14 record as c20[index]
                    index = index + 1
                    break
                else:   
                    index = index +1                           
    print('Degree 2,3 replacement complete!')

    #''' Replace deg 1 '''
    new_file_TN13 = path_tn13  
    rep_start_date_deg1, rep_end_date_deg1, c1m, s1m, c1m_sigma, s1m_sigma = [], [], [], [], [], []
    with open(new_file_TN13,"r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0,len(stuff),1):
            if  stuff[i] == str('end of header ===============================================================================\n'):
                print("found:",i+1)
                line_num = i + 1
                break
            else:
                line_num = str('not found')
        pattern = r'\s+'
        count = 0
        while (line_num<len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c1m.append( float(split [3]) )
            s1m.append( float(split[4]) )
            c1m_sigma.append( float(split[5])*1e-10)
            s1m_sigma.append( float(split[6])*1e-10)
            rep_start_date_deg1.append(split[7][0:4] +'-'+split[7][4:6] +'-'+split[7][6:8])
            rep_end_date_deg1.append(split[8][0:4] +'-'+split[8][4:6] +'-'+split[8][6:8])
            line_num = line_num + 1
            count = count + 1 

    # replace deg 1   
    index = 0
    Lmax = degree[0][-1]
    margin_deg1=datetime.timedelta(days = 5)
    for year in range(0,time_axes+1,1):
        for y in range(0,int(len(clm[year])/degree_order),1):    
            while(index<len(c1m)):
                curr_date_deg1 = datetime.datetime.strptime(start_date[year][y*degree_order], '%Y-%m-%d')
                rep_date_cmp_deg1 = datetime.datetime.strptime(rep_start_date_deg1[index], '%Y-%m-%d')

                if  rep_date_cmp_deg1-margin_deg1 <= curr_date_deg1 <= rep_date_cmp_deg1+margin_deg1:
                    print(curr_date_deg1, rep_date_cmp_deg1, index)
                    # degree[year][y*degree_order+1] = int(1)
                    # degree[year][y*degree_order+1+Lmax] = int(1)
                    # order[year][y*degree_order+1] = int(0)
                    # order[year][y*degree_order+1+Lmax] = int(1)
                    clm[year][y*degree_order+1] = c1m[index]
                    clm[year][y*degree_order+1+Lmax] = c1m[index+1]
                    slm[year][y*degree_order+1] = s1m[index]
                    slm[year][y*degree_order+1+Lmax] = s1m[index+1]
                    clm_std[year][y*degree_order+1] = c1m_sigma[index]
                    clm_std[year][y*degree_order+1+Lmax] = c1m_sigma[index+1]
                    slm_std[year][y*degree_order+1] = s1m_sigma[index]
                    slm_std[year][y*degree_order+1+Lmax] = s1m_sigma[index+1]

                    index = index +2
                    break
                else:
                    index = index +2
    print('Degree 1 replacement complete!') 

    #''' Save everything in a list '''
    saved_as_num=[degree,order,clm,slm,clm_std,slm_std,start_date,end_date]

    #''' Number of years '''
    beta = np.zeros(len(start_date))
    sum = 0
    for x in range(0,time_axes+1,1):
        beta[x] = (len(start_date[x])/degree_order)
        sum = sum + len(start_date[x])/degree_order 

    #''' Finding the dates for time bounds of data '''
    dates_start, dates_end = [],[] 
    for i in range(0,time_axes+1,1):
        j = 0
        while j < beta[i]:
            dates_start.append(str(start_date[i][j*degree_order]))
            dates_end.append(str(end_date[i][j*degree_order]))
            j = j + 1
    print("Number of months of data in each year starting", dates_start[0], \
          "& ending", dates_end[-1], beta)   

    #import pickle
    #with open("/home/wslvivek/Desktop/level2/pysh_v2/output/saved_as_num","wb") as pk:
    #    pickle.dump(saved_as_num, pk)
    return saved_as_num, dates_start,dates_end, no_of_files
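Unlike the JPL variant, the CSR replacement matches dates within a ±5-day window rather than requiring exact equality. The comparison can be sketched in isolation (the helper `within_margin` is hypothetical; the dates are illustrative):

```python
import datetime

# Sketch of the +/- 5-day matching window used in reader_replacer_csr.
def within_margin(data_date: str, tn14_date: str, days: int = 5) -> bool:
    d = datetime.datetime.strptime(data_date, '%Y-%m-%d')
    r = datetime.datetime.strptime(tn14_date, '%Y-%m-%d')
    margin = datetime.timedelta(days=days)
    return r - margin <= d <= r + margin
```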

Reads the ITSG GRACE L2 files and replaces the degree-1, C20 and C30 coefficients using the TN-13 and TN-14 replacement files.

Parameters:

Name Type Description Default
path str

Path to the folder containing the GRACE L2 .gfc files.

required
path_tn14 str

Path to the TN-14 C20/C30 replacement file.

required
path_tn13 str

Path to the TN-13 degree-1 replacement file.

required

Returns:

Name Type Description
tuple

saved_as_num (list of degree, order, clm, slm, clm_std, slm_std, start_date), dates_start, and the number of files read.

Source code in pyshbundle/reader_replacer_itsg.py
def reader_replacer_itsg(path, path_tn14, path_tn13):
    """Reads the ITSG GRACE L2 files and replaces the degree-1, C20 and C30
    coefficients using the TN-13 and TN-14 replacement files.

    Args:
        path (str): Path to the folder containing the GRACE L2 .gfc files.
        path_tn14 (str): Path to the TN-14 C20/C30 replacement file.
        path_tn13 (str): Path to the TN-13 degree-1 replacement file.

    Returns:
        tuple: (saved_as_num, dates_start, no_of_files)
    """
    file_list = os.listdir(path)

    filenames = os.listdir(path)       #Names of files in folder    
    # Identify the data product source
    if 'GFZ' in str(filenames[0]):
        source = str('GFZ')
    elif 'CSR' in str(filenames[0]):
        source = str('CSR')
    elif 'JPL' in str(filenames[0]):
        source = str('JPL')
    elif 'ITSG' in str(filenames[0]):
        source = str('ITSG')
    print(source)
    aa = sorted(file_list, key = last_4chars)  # sort files chronologically; the first file gives the start year
    year_start = aa[0][-11:-7]
    time_axes = 0

    # Counts the number of files
    no_of_files = 0

    # Empty lists to store variables
    line_num = 0
    oi = 22
    degree = [[] for x in range(oi)]
    order = [[] for x in range(oi)]
    clm = [[] for x in range(oi)]
    slm = [[] for x in range(oi)]
    clm_std = [[] for x in range(oi)]
    slm_std = [[] for x in range(oi)]
    start_date = [[] for x in range(oi)]

    # Iterate through all files
    for file in sorted(file_list, key = last_4chars):
        #print(file)
        if file.endswith(".gfc"):
            file_name = f"{path}/{file}"
            #print(file_name[-39:-32])

            # Save yearwise
            year_start, time_axes = TIME(year_start,file_name,time_axes)

            # Call the function 'reader'
            reader(file_name,line_num,degree,order,clm,slm,clm_std,slm_std,start_date,year_start,time_axes)
            no_of_files = no_of_files + 1
    print('Reading into clm format complete!')
    print("Number of files read:", no_of_files)

    Lmax=degree[0][-1]
    degree_order=int((Lmax+1)*(Lmax+2)/2)

    ''' Replacement '''

    print('Starting replacement')
    ''' Replace deg 2,3 '''
    new_file_TN14 = path_tn14

    rep_start_date, rep_end_date, c20, c30, c20_sigma, c30_sigma = [], [], [], [], [], []
    with open(new_file_TN14,"r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0,len(stuff),1):
            line_num = str('not found')
            if  stuff[i] == str('Product:\n'):
                print("found:",i+1)
                line_num = i + 1
                break
        if type(line_num) is str:
            print('Replacement data not found')
        pattern = r'\s+'
        count = 0
        while (line_num<len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c20.append( float(split [2]) )
            c30.append( float(split[5]) )
            c20_sigma.append(float(split[4])*1e-10)
            c30_sigma.append(float(split[7])*1e-10)
            rep_start_date.append(str(julian.from_jd(float(split[0]), fmt='mjd').date()))
            rep_end_date.append(str(julian.from_jd(float(split[8]), fmt='mjd').date()))
            line_num = line_num + 1
            count = count + 1

    # Actual replacement    

    index = 0
    margin=datetime.timedelta(days = 23)
    for year in range(0,time_axes+1,1):
        for y in range(0,int(len(clm[year])/degree_order),1):    
            while(index<len(c20)):

                curr_date = datetime.datetime.strptime(start_date[year][y*degree_order], '%Y-%m')
                rep_date_cmp = datetime.datetime.strptime(rep_start_date[index], '%Y-%m-%d')

                if rep_date_cmp-margin <= curr_date <= rep_date_cmp+margin:
                    print(f"Data date - {curr_date}, replacemebt date (tn-14) = {rep_date_cmp}")
                    clm[year][y*degree_order+3] = c20[index]
                    clm_std[year][y*degree_order+3] = c20_sigma[index]
                    if math.isnan(c30[index]) == False:
                        clm[year][y*degree_order+6] = c30[index]
                        clm_std[year][y*degree_order+6] = c30_sigma[index]
                    index= index + 1
                    break
                else:   
                    index = index +1

    print('Degree 2,3 replacement complete!')


    ''' Replace deg 1 '''
    new_file_TN13 = path_tn13
    rep_start_date_deg1, rep_end_date_deg1, c1m, s1m, c1m_sigma, s1m_sigma = [], [], [], [], [], []
    with open(new_file_TN13,"r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0,len(stuff),1):
            if  stuff[i] == str('end of header ===============================================================================\n'):
                print("found:",i+1)
                line_num = i + 1
                break
            else:
                line_num = str('not found')
        pattern = r'\s+'
        count = 0
        while (line_num<len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c1m.append( float(split [3]) )
            s1m.append( float(split[4]) )
            c1m_sigma.append( float(split [5]) )
            s1m_sigma.append( float(split[6]) )
            rep_start_date_deg1.append(split[7][0:4] +'-'+split[7][4:6] +'-'+split[7][6:8])
            rep_end_date_deg1.append(split[8][0:4] +'-'+split[8][4:6] +'-'+split[8][6:8])
            line_num = line_num + 1
            count = count + 1 

    # replace deg 1   
    index = 0
    margin=datetime.timedelta(days = 23)
    for year in range(0,time_axes+1,1):
        for y in range(0,int(len(clm[year])/degree_order),1):    
            while(index<len(c1m)):

                curr_date = datetime.datetime.strptime(start_date[year][y*degree_order], '%Y-%m')
                rep_date_cmp_deg1 = datetime.datetime.strptime(rep_start_date_deg1[index], '%Y-%m-%d')

                if rep_date_cmp_deg1-margin <= curr_date <=rep_date_cmp_deg1+margin:
                    print(f"Data date - {curr_date}, replacemebt date (tn-13) = {rep_date_cmp_deg1}")
                    # positions +1 and +2 hold (degree 1, order 0) and
                    # (degree 1, order 1); both entries have degree 1
                    degree[year][y*degree_order+1]=int(1)
                    degree[year][y*degree_order+2]=int(1)
                    order[year][y*degree_order+1]=int(0)
                    order[year][y*degree_order+2]=int(1)
                    clm[year][y*degree_order+1]=float(c1m[index])
                    clm[year][y*degree_order+2]=float(c1m[index+1])
                    slm[year][y*degree_order+1]=float(s1m[index])
                    slm[year][y*degree_order+2]=float(s1m[index+1])
                    clm_std[year][y*degree_order+1]=float(c1m_sigma[index])
                    clm_std[year][y*degree_order+2]=float(c1m_sigma[index+1])
                    slm_std[year][y*degree_order+1]=float(s1m_sigma[index])
                    slm_std[year][y*degree_order+2]=float(s1m_sigma[index+1])
                    start_date[year][y*degree_order+1]=rep_start_date_deg1[index]
                    start_date[year][y*degree_order+2]=rep_start_date_deg1[index+1]
                    index = index +2
                    break
                else:
                    index = index +2
    print('Degree 1 replacement complete!')

    ''' Save everything in a list '''
    saved_as_num=[degree,order,clm,slm,clm_std,slm_std,start_date]
    # import pickle
    # with open("/home/wslvivek/Desktop/level2/pysh_v2/output/saved_as_num","wb") as pk:
    #     pickle.dump(saved_as_num, pk)


    ''' Number of years '''
    beta = np.zeros(len(start_date))
    total = 0
    for x in range(0,time_axes+1,1):
        beta[x] = round(len(start_date[x])/degree_order)
        total = total + beta[x]

    ''' Finding the dates for time bounds of data '''
    dates_start = []
    for i in range(0,time_axes+1,1):
        j = 0
        while j < beta[i]:
            dates_start.append(str(start_date[i][j*degree_order]))
            # dates_end.append(str(end_date[i][j*int(degree_order-3)]))
            j = j + 1
    print("Number of months of data in each year starting", dates_start[0], beta)
    return saved_as_num, dates_start, no_of_files
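The ±23-day window matching in the degree-1 replacement loop above can be distilled into a small standalone helper. `within_margin` is an illustrative name, not a pyshbundle function:

```python
import datetime

def within_margin(data_date: str, rep_date: str, days: int = 23) -> bool:
    """Return True if a TN-13 replacement epoch lies within +/- `days` of the data epoch."""
    curr = datetime.datetime.strptime(data_date, '%Y-%m')
    rep = datetime.datetime.strptime(rep_date, '%Y-%m-%d')
    margin = datetime.timedelta(days=days)
    return rep - margin <= curr <= rep + margin

print(within_margin('2002-05', '2002-05-19'))  # True: within 23 days
print(within_margin('2002-05', '2002-07-15'))  # False: outside the window
```

The 23-day margin accommodates the fact that monthly GRACE solutions and TN-13 epochs do not align exactly on calendar-month boundaries.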

Function to read files & extract data

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_name | str | Path to the gzipped GRACE L2 data file | required |
| line_num | int | Line number at which to start reading data (updated once the end of the YAML header is found) | required |
| degree | list | Accumulator for SH degree values, per time axis | required |
| order | list | Accumulator for SH order values, per time axis | required |
| clm | list | Accumulator for Clm coefficients | required |
| slm | list | Accumulator for Slm coefficients | required |
| delta_clm | list | Accumulator for Clm uncertainties | required |
| delta_slm | list | Accumulator for Slm uncertainties | required |
| start_date | list | Accumulator for epoch begin dates (YYYY-MM-DD) | required |
| end_date | list | Accumulator for epoch end dates (YYYY-MM-DD) | required |
| year_start | list | Start dates used to group files by year | required |
| time_axes | int | Index of the current time axis (year) | required |
Source code in pyshbundle/reader_replacer.py
def reader(file_name: str,line_num, degree: int, order: int, clm,slm,delta_clm,delta_slm,start_date,end_date,year_start,time_axes):
    """Function to read files & extract data

    Args:
        file_name (_type_): _description_
        line_num (_type_): _description_
        degree (_type_): _description_
        order (_type_): _description_
        clm (_type_): _description_
        slm (_type_): _description_
        delta_clm (_type_): _description_
        delta_slm (_type_): _description_
        start_date (_type_): _description_
        end_date (_type_): _description_
        year_start (_type_): _description_
        time_axes (_type_): _description_
    """
    with gzip.open(file_name,"r") as file:
        stuff = file.readlines()
        for i in range (0,len(stuff),1):
            #print(str(line))
            if str(stuff[i]) == str( b'# End of YAML header\n',):
                #print("found:",i+1)
                line_num = i+1                                                 #Line number of starting line of data
                break
        pattern = r'\s+'                                                       #Delimiter for splitting
        while(line_num<len(stuff)):
            split = re.split(  pattern, str(stuff[line_num]))                  #split wordwise with Regex
            degree[time_axes].append(  int(split[1])  )
            order[time_axes].append( int(split [2])    )
            clm[time_axes].append( float(split [3])    )
            slm[time_axes].append( float(split [4])    )
            delta_clm[time_axes].append( float(split [5])    )
            delta_slm[time_axes].append(  float(split[6])    )
            start_date[time_axes].append(  split[7][0:4]+'-'+split[7][4:6]+'-'+split[7][6:8]  )
            end_date[time_axes].append(  split[8][0:4]+'-'+split[8][4:6]+'-'+split[8][6:8]  )
            line_num = line_num + 1
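A single data record parsed the way `reader` does it, shown on an in-memory line (field values here are illustrative, not taken from a real file):

```python
import re

# Fields of a GSM record: key, degree, order, Clm, Slm, sigma Clm, sigma Slm,
# epoch begin (yyyymmdd.0), epoch end (yyyymmdd.0)
line = "GRCOF2 2 0 -4.841694e-04 0.0 7.4e-13 0.0 20020519.0 20020531.0"

split = re.split(r'\s+', line)
degree, order = int(split[1]), int(split[2])
clm, slm = float(split[3]), float(split[4])
# reformat the epoch begin field from yyyymmdd.0 to yyyy-mm-dd
start_date = split[7][0:4] + '-' + split[7][4:6] + '-' + split[7][6:8]

print(degree, order, start_date)  # 2 0 2002-05-19
```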

Newer Approach

Reads the spherical harmonic data provided by JPL

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Absolute path to the file | required |

Returns:

| Type | Description |
|---|---|
| dict | Header info in a structured dictionary |
| np.ndarray | SH coefficient data in the form of a NumPy n-dim array; format is CLM |
| set | Epoch start and end times as per the GRACE data file (e.g. 20020519.0) |

Source code in pyshbundle/io.py
def read_jpl(file_path: str):
    """Reads the spherical harmonic data provided by JPL

    Args:
        file_path (str): Absolute path to the file

    Returns:
        dict: Header info in a structured dictionary
        np.ndarray: SH coefficient data in form of a numpy n-dim array. Format is CLM.
        set: Epoch start and end times as per the GRACE datafile. (eg. 20020519.0)

    """
    # ensure that the file path is valid then proceed

    source = 'JPL'

    # check if the file is zipped or not

    if file_path[-3:] == '.gz':
        # open the file and read the lines
        with gzip.open(file_path, 'r') as file:

            # read the file line wise -> obtain a list of bytes
            info_lines = file.readlines()
            num_lines = len(info_lines)

            for i in range(len(info_lines)):
                # find the end of header sentence in the text file
                if str(info_lines[i]) == str(b'# End of YAML header\n',):
                    end_of_header_idx = i
                    break

        # everything after the header is the numerical data
        header_info = info_lines[:end_of_header_idx]
        jpl_data = info_lines[end_of_header_idx+1:]

        # parse the header strings to extract relevant metadata info
        jpl_header = parse_jpl_header(header_info)

        # call the parse data function and create a dictionary or matrix
        clm_mat, start_date, end_date = parse_jpl_data(jpl_data)

        # build a dictionary

        # Organize the data into either matrix, dataframe or dictionary format

    return jpl_header, clm_mat, start_date, end_date
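The extension check above slices the last three characters of the path; a slightly more robust variant (a sketch, `is_gzipped` is not a pyshbundle function) uses `pathlib`:

```python
from pathlib import Path

def is_gzipped(file_path: str) -> bool:
    # Path.suffix isolates the final extension; lower() makes the check
    # case-insensitive, unlike the raw file_path[-3:] slice.
    return Path(file_path).suffix.lower() == '.gz'

print(is_gzipped('GSM-2_2002122-2002137_GRAC_JPLEM_BA01_0600.gz'))  # True
print(is_gzipped('GSM-2_2002122-2002137_GRAC_JPLEM_BA01_0600'))     # False
```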

Reads the spherical harmonic data provided by CSR

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Absolute path to the GRACE data file from CSR | required |

Returns:

| Type | Description |
|---|---|
| dict | Header info in a structured dictionary |
| np.ndarray | SH coefficient data in the form of a NumPy n-dim array; format is KLM |
| set | Epoch begin and stop dates stored as float (e.g. 20020519.0) |

Source code in pyshbundle/io.py
def read_csr(file_path: str):
    """Reads the spherical harmonic data provided by CSR

    Args:
        file_path (str): Absolute path to the GRACE data file from CSR

    Returns:
        dict: Header info in a structured dictionary
        np.ndarray: SH coefficient data in form of a numpy n-dim array. Format is KLM
        set: Epoch begin and stop dates stored as float (eg. 20020519.0)
    """
    # ensure that the file path is valid then proceed

    source = 'CSR'

    # check if the file is zipped or not
    if file_path[-3:] == '.gz':
        # open the file and read the lines
        with gzip.open(file_path, 'r') as file:

            # read the file line wise -> obtain a list of bytes
            info_lines = file.readlines()
            num_lines = len(info_lines)

            for i in range(len(info_lines)):
                # find the index of line which indicates end of header info
                if str(info_lines[i]) == str(b'# End of YAML header\n',):
                    end_of_header_idx = i
                    break

        header_info = info_lines[:end_of_header_idx]
        # everything below header info is the sought after numeric data
        csr_data = info_lines[end_of_header_idx+1:]

        # parse the header strings to extract relevant metadata info
        csr_header = parse_jpl_header(header_info)

        # parse the data strings to extract numeric data in suitable matrix fmt
        klm_mat, start_time, stop_time = parse_csr_data(csr_data)

        # Organize the data into either matrix, dataframe or dictionary format

    return csr_header, klm_mat, start_time, stop_time
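The header/data split that both readers perform can be exercised on an in-memory list of byte strings (the file content here is a short stand-in, not real GRACE data):

```python
lines = [
    b'title: GRACE L2 product\n',
    b'# End of YAML header\n',
    b'GRCOF2 2 0 -4.84e-04 0.0 7.0e-13 0.0 20020519.0 20020531.0\n',
]

# locate the sentinel line that terminates the YAML header
end_of_header_idx = next(i for i, ln in enumerate(lines)
                         if ln == b'# End of YAML header\n')

header_info = lines[:end_of_header_idx]       # everything above the sentinel
data = lines[end_of_header_idx + 1:]          # numeric records below it

print(end_of_header_idx, len(header_info), len(data))  # 1 1 1
```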


Reads the TN-13 replacement file containing the degree-1 spherical harmonic coefficients

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Absolute path to the TN-13 replacement file | required |

Returns:

| Type | Description |
|---|---|
| np.ndarray | Matrix of degree-1 replacement coefficients |

Source code in pyshbundle/io.py
def read_tn13(file_path):
    """_summary_

    Args:
        file_path (_type_): _description_

    Returns:
        _type_: _description_
    """

    # check if file exists at given path

    # check the file format - txt of zipped

    if file_path[-4:] == '.txt':
        with open(file_path, 'r', encoding='utf8') as file:

            # read the file line wise -> obtain a list of bytes
            info_lines = file.readlines()
            num_lines = len(info_lines)

            # find the end of header
            for i in range(len(info_lines)):
                if info_lines[i] == 'end of header ===============================================================================\n':
                    end_of_header_idx = i
                    break

        header_info = info_lines[:end_of_header_idx]
        tn13_data = info_lines[end_of_header_idx+1:]

        # call the parse header info function and create a metadata dictionary

        # call the parse data function and create a dictionary or matrix
        tn13_clm_mat = parse_tn13_data(tn13_data)

        # Organize the data into either matrix, dataframe or dictionary format

    return tn13_clm_mat

Reads the TN-14 replacement file containing the C20 and C30 coefficient corrections

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Absolute path to the TN-14 replacement file | required |

Returns:

| Type | Description |
|---|---|
| np.ndarray | Matrix of C20/C30 replacement coefficients |

Source code in pyshbundle/io.py
def read_tn14(file_path):
    """_summary_

    Args:
        file_path (_type_): _description_

    Returns:
        _type_: _description_
    """

    # check the file extension - '.txt' or some zipped format

    # read the data into a list
    if file_path[-4:] == '.txt':
        with open(file_path, 'r', encoding='utf8') as file:

            # read the file line wise -> obtain a list of bytes
            info_lines = file.readlines()

            # find the end of header
            for i in range(len(info_lines)):
                if info_lines[i] == 'Product:\n':
                    end_of_header_idx = i
                    break

        header_info = info_lines[:end_of_header_idx]
        tn14_data = info_lines[end_of_header_idx+1:]

        # parse the data
        tn14_replacement_mat = parse_tn14_data(tn14_data)

    return tn14_replacement_mat
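TN-13 and TN-14 terminate their headers with different sentinel lines ('end of header ...' vs 'Product:'). A generic locator (a hypothetical helper, not part of pyshbundle) makes the two readers symmetric:

```python
def find_data_start(lines, sentinel):
    """Return the index of the first data line after the header sentinel."""
    for i, line in enumerate(lines):
        if line.startswith(sentinel):
            return i + 1
    raise ValueError(f'sentinel {sentinel!r} not found')

tn14_lines = ['header line\n', 'Product:\n', '2002.3 ... data ...\n']
print(find_data_start(tn14_lines, 'Product:'))  # 2

tn13_lines = ['header line\n', 'end of header ====\n', 'GRCOF2 ...\n']
print(find_data_start(tn13_lines, 'end of header'))  # 2
```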

Computing Long Term Mean

The purpose of this script is to load the long-term mean for the GRACE SH data.

It requires the GRACE data source (CSR, JPL or ITSG) as well as the path to the long-term mean values.

For RL06, sample data have been provided within the package. If this option is chosen, the program directly returns the preloaded long-term mean values.

Returns the path to the long-term mean file.

load_longterm_mean(source='', use_sample_mean=0)

Loads the long-term mean values for the GRACE SH data

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| source | str | Source of the SH data (CSR, JPL or ITSG). Defaults to "". | '' |
| use_sample_mean | int | Set to 1 to use the preloaded RL06 sample mean values. Defaults to 0. | 0 |

Raises:

| Type | Description |
|---|---|
| Exception | Raised on an incorrect selection of source |

Returns:

| Type | Description |
|---|---|
| str | Path to the long-term mean (.npy) file |

Todo
  • Not sure if using "source = ''" is all right
  • Instead of the base Exception, a ValueError could be raised
Source code in pyshbundle/load_longterm_mean.py
def load_longterm_mean(source = "", use_sample_mean = 0):
    """_summary_

    Args:
        source (str, optional): _description_. Defaults to "".
        use_sample_mean (int, optional): _description_. Defaults to 0.

    Raises:
        Exception: _description_

    Returns:
        _type_: _description_

    Todo:
        + Not sure if using "source = ''" is all right
        + Instead of the base Exception, a ValueError could be raised
    """
    if use_sample_mean == 1:
        print("Loading preloaded RL06 long term mean values")
        print("Please ensure that your data is RL06 \nIf not, please manually input long term mean by setting use_sample_mean = 0")

        if str.upper(source) == 'CSR':
            long_mean = pkg_resources.resource_filename('pyshbundle', 'data/RL06_long_mean/SH_long_mean_csr.npy')
        elif str.upper(source) == 'JPL':
            long_mean = pkg_resources.resource_filename('pyshbundle', 'data/RL06_long_mean/SH_long_mean_jpl.npy')
        elif str.upper(source) == 'ITSG':
            long_mean = pkg_resources.resource_filename('pyshbundle', 'data/RL06_long_mean/SH_long_mean_itsg.npy')
        else:
            raise Exception("Incorrect selection of source")
        print("Successfully loaded preloaded longterm means")
    else:
        print("Please download and provide the longterm GRACE SH mean values")
        print("Instructions to download the longterm GRACE SH mean values may be referred to in https://github.com/mn5hk/pyshbundle/blob/main/docs/index.md#how-to-download-data")
        long_mean = str(input("Enter the longterm mean for the SH values in the numpy (.npy) format"))
        print("Successfully loaded path to long term mean:", long_mean)

    return long_mean
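Once the path is returned, the long-term mean is typically loaded with `np.load` and subtracted from each monthly field to obtain anomalies. The shapes below are stand-ins for illustration:

```python
import numpy as np

# stand-in for: long_mean = np.load(load_longterm_mean(...))
long_mean = np.zeros((4, 4))

# stand-in for three months of SH coefficient grids
monthly = np.ones((3, 4, 4))

# broadcasting subtracts the (4, 4) mean from every month along axis 0
anomalies = monthly - long_mean
print(anomalies.shape)  # (3, 4, 4)
```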

Physical and Geodetic Constants

This script contains the major relevant physical and geodetic (GRS80) constants:

  • clight speed of light - \(2.99792458e+8\) \(m/s\)
  • G gravitational constant - \(6.67259e-11\) \(\frac{m^3}{kg \cdot s^2}\)
  • au astronomical unit - \(149.597870691e+9\) \(m\)

  • ae semi-major axis of ellipsoid GRS80 - \(6378137\) \(m\)

  • GM geocentric gravitational constant GRS80 - \(3.986005e+14\) \(\frac{m^3}{s^2}\)
  • J2 Earth's dynamic form factor GRS80 - \(1.08263e-3\) (unitless, unnormalized C20 coefficient)
  • Omega mean angular velocity GRS80 - \(7.292115e-5\) \(\frac{rad}{s}\)

  • flat flattening - \(\frac{1}{298.257222101}\)
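The constants above, written out as plain Python module-level values (the names here are illustrative, not necessarily those used inside pyshbundle):

```python
CLIGHT = 2.99792458e8       # speed of light [m/s]
G      = 6.67259e-11        # gravitational constant [m^3 kg^-1 s^-2]
AU     = 149.597870691e9    # astronomical unit [m]
AE     = 6378137.0          # GRS80 semi-major axis [m]
GM     = 3.986005e14        # GRS80 geocentric gravitational constant [m^3/s^2]
J2     = 1.08263e-3         # GRS80 dynamic form factor (unnormalized C20)
OMEGA  = 7.292115e-5        # GRS80 mean angular velocity [rad/s]
FLAT   = 1 / 298.257222101  # GRS80 flattening

# quick consistency check: GM / ae^2 gives a surface-gravity estimate
print(round(GM / AE**2, 2))  # about 9.8 m/s^2
```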

Reference